(OHHH NUTS!)
1) World Health Organization - "Life Expectancy" of 2015 Source: Kaggle
2) Global Health Security Index - "Building Collective Action & Accountability" Index developed with The Economist Intelligence Unit for 2019 Source: Johns Hopkins Bloomberg School of Public Health Co-Funded by Bill & Melinda Gates Foundation, Robertson Foundation, & Open Philanthropy Project
3) COVID Global Data Set Source: Kaggle "Nutrition/Diet During COVID-19" This dataset shows the percentages of fat consumed from each type of food listed. The end of the dataset also includes obesity and undernourished percentage, and the percentage of COVID-19 Confirmed/Deaths/Recovered/Active cases. (Note: All the data have unit % except Population, which is just the population count). https://www.kaggle.com/mariaren/covid19-healthy-diet-dataset/data https://storage.googleapis.com/kagglesdsdata/datasets/618335/1175348/Fat_Supply_Quantity_Data.csv?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1590357735&Signature=bWQ%2Fj3yRSui127xJMXfJ%2BtW6L0L5xx68ngVOg8U3ewhDulVfTGGUSSVyTSQmO5%2FPnt0%2FAXI7zhGfc6Keejv7bDqaK5mB6pZX4YzqHWPfiBo4s8Oa20eVOXlr2fMFdAZSiOLa6d2fpcia9QfebPZflAH4SgGEO5nuqbJLj%2FDOF0Mv1VHkRRp2BZ8dWsogc2jqxDIqISkOnFxasYodcbzL0lKKzPcfF7QJy%2ByzZj3osBjQ0OXCLwjiAT2DdFdwe%2FQ7Bkthv8fNAW8w9uXExH79t%2BL4QV%2F%2F10BKldwlht3mvzg18ZQYIvrmMtA%2BVLsNYBcvSIH5p%2FJCGN1v6fKnTNZXsw%3D%3D&response-content-disposition=attachment%3B+filename%3DFat_Supply_Quantity_Data.csv
~#1 Key variables for measuring healthcare shortages in both low- and middle-income countries: ventilators and critical care beds.
~#2 May 6: Iran reached 100,000 COVID-19 cases and had remained in the top 5 globally for over a month. Source: https://twitter.com/WHOEMRO/status/1258096811786592256/photo/1
~Observed COVID-19's impact on healthcare workers in my family, who follow a traditional Eastern Mediterranean diet that includes citrus fruits, almonds, nigella seeds on cheese or bread, pistachios, and pine nuts. Source: https://anjomanfood.com/nuts-in-the-middle-eastern-and-mediterranean-diet/
~Noticed Iran, Saudi Arabia, UAE (travel hub), Egypt, and Pakistan experienced cases with varying medical responses. Eastern Mediterranean region: Afghanistan, Bahrain, Egypt, Iran, Iraq, Jordan, Kuwait, Lebanon, Libya, Morocco, Pakistan, Palestine, Saudi Arabia, Syria, Tunisia, UAE, and Yemen.
~Tracked COVID-19 incidences across WHO maps.
~Excited to test my physician father's theory (and personal diet), since we were all infected in January-February.
df
| | Country | Alcoholic Beverages | Animal Products | Animal fats | Aquatic Products, Other | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | ... | Vegetable Oils | Vegetables | Obesity | Undernourished | Confirmed | Deaths | Recovered | Active | Population | Unit (all except Population) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 0.0000 | 21.6397 | 6.2224 | 0.0 | 8.0353 | 0.6859 | 0.0327 | 0.4246 | 6.1244 | ... | 17.0831 | 0.3593 | 4.5 | 29.8 | 0.021411 | 0.000492 | 0.002445 | 0.018474 | 38042000.0 | % |
| 1 | Albania | 0.0000 | 32.0002 | 3.4172 | 0.0 | 2.6734 | 1.6448 | 0.1445 | 0.6418 | 8.7428 | ... | 9.2443 | 0.6503 | 22.3 | 6.2 | 0.033730 | 0.001085 | 0.026522 | 0.006123 | 2858000.0 | % |
| 2 | Algeria | 0.0000 | 14.4175 | 0.8972 | 0.0 | 4.2035 | 1.2171 | 0.2008 | 0.5772 | 3.8961 | ... | 27.3606 | 0.5145 | 26.6 | 3.9 | 0.017375 | 0.001309 | 0.009142 | 0.006925 | 43406000.0 | % |
| 3 | Angola | 0.0000 | 15.3041 | 1.3130 | 0.0 | 6.5545 | 0.1539 | 1.4155 | 0.3488 | 11.0268 | ... | 22.4638 | 0.1231 | 6.8 | 25 | 0.000165 | 0.000010 | 0.000054 | 0.000102 | 31427000.0 | % |
| 4 | Antigua and Barbuda | 0.0000 | 27.7033 | 4.6686 | 0.0 | 3.2153 | 0.3872 | 1.5263 | 1.2177 | 14.3202 | ... | 14.4436 | 0.2469 | 19.1 | NaN | 0.025773 | 0.003093 | 0.019588 | 0.003093 | 97000.0 | % |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 165 | Venezuela (Bolivarian Republic of) | 0.0000 | 16.3261 | 2.2673 | 0.0 | 2.5449 | 0.6555 | 0.5707 | 0.9640 | 7.0949 | ... | 29.5211 | 0.1851 | 25.2 | 21.2 | 0.002890 | 0.000035 | 0.000919 | 0.001936 | 28516000.0 | % |
| 166 | Vietnam | 0.0000 | 33.2484 | 3.8238 | 0.0 | 3.7155 | 0.7839 | 1.1217 | 0.4079 | 26.4292 | ... | 5.6211 | 0.6373 | 2.1 | 9.3 | 0.000339 | 0.000000 | 0.000275 | 0.000064 | 95656000.0 | % |
| 167 | Yemen | 0.0000 | 12.5401 | 2.0131 | 0.0 | 11.5271 | 0.5514 | 0.3847 | 0.2564 | 8.0010 | ... | 23.6312 | 0.1667 | 14.1 | 38.9 | 0.000631 | 0.000103 | 0.000017 | 0.000511 | 29162000.0 | % |
| 168 | Zambia | 0.0783 | 9.6005 | 1.6113 | 0.0 | 14.3225 | 0.6266 | 1.0070 | 0.1343 | 4.9010 | ... | 15.2848 | 0.1567 | 6.5 | 46.7 | 0.004658 | 0.000039 | 0.001103 | 0.003516 | 17861000.0 | % |
| 169 | Zimbabwe | 0.0000 | 10.3796 | 2.9543 | 0.0 | 9.7922 | 0.3682 | 0.2455 | 0.0614 | 4.5674 | ... | 26.9396 | 0.0789 | 12.3 | 51.3 | 0.000328 | 0.000027 | 0.000123 | 0.000178 | 14645000.0 | % |
170 rows × 32 columns
df.columns
Index(['Country', 'Alcoholic Beverages', 'Animal Products', 'Animal fats',
'Aquatic Products, Other', 'Cereals - Excluding Beer', 'Eggs',
'Fish, Seafood', 'Fruits - Excluding Wine', 'Meat', 'Miscellaneous',
'Milk - Excluding Butter', 'Offals', 'Oilcrops', 'Pulses', 'Spices',
'Starchy Roots', 'Stimulants', 'Sugar Crops', 'Sugar & Sweeteners',
'Treenuts', 'Vegetal Products', 'Vegetable Oils', 'Vegetables',
'Obesity', 'Undernourished', 'Confirmed', 'Deaths', 'Recovered',
'Active', 'Population', 'Unit (all except Population)'],
dtype='object')
import pandas as pd
from sqlalchemy import create_engine

#Connection settings for the PostgreSQL database
postgres_user = 'dsbc_student'
postgres_pw = '7*.8G9QH21'
postgres_host = '142.93.121.174'
postgres_port = '5432'
postgres_db = 'lifeexpectancy'
engine = create_engine('postgresql://{}:{}@{}:{}/{}'.format(
    postgres_user, postgres_pw, postgres_host, postgres_port, postgres_db))
#Load the full lifeexpectancy table, then release the connection pool
life_df = pd.read_sql_query('select * from lifeexpectancy', con=engine)
engine.dispose()
life_df
| | Country | Year | Status | Life expectancy | Adult Mortality | infant deaths | Alcohol | percentage expenditure | Hepatitis B | Measles | ... | Polio | Total expenditure | Diphtheria | HIV/AIDS | GDP | Population | thinness 1-19 years | thinness 5-9 years | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 2015 | Developing | 65.0 | 263.0 | 62 | 0.01 | 71.279624 | 65.0 | 1154 | ... | 6.0 | 8.16 | 65.0 | 0.1 | 584.259210 | 33736494.0 | 17.2 | 17.3 | 0.479 | 10.1 |
| 1 | Afghanistan | 2014 | Developing | 59.9 | 271.0 | 64 | 0.01 | 73.523582 | 62.0 | 492 | ... | 58.0 | 8.18 | 62.0 | 0.1 | 612.696514 | 327582.0 | 17.5 | 17.5 | 0.476 | 10.0 |
| 2 | Afghanistan | 2013 | Developing | 59.9 | 268.0 | 66 | 0.01 | 73.219243 | 64.0 | 430 | ... | 62.0 | 8.13 | 64.0 | 0.1 | 631.744976 | 31731688.0 | 17.7 | 17.7 | 0.470 | 9.9 |
| 3 | Afghanistan | 2012 | Developing | 59.5 | 272.0 | 69 | 0.01 | 78.184215 | 67.0 | 2787 | ... | 67.0 | 8.52 | 67.0 | 0.1 | 669.959000 | 3696958.0 | 17.9 | 18.0 | 0.463 | 9.8 |
| 4 | Afghanistan | 2011 | Developing | 59.2 | 275.0 | 71 | 0.01 | 7.097109 | 68.0 | 3013 | ... | 68.0 | 7.87 | 68.0 | 0.1 | 63.537231 | 2978599.0 | 18.2 | 18.2 | 0.454 | 9.5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2933 | Zimbabwe | 2004 | Developing | 44.3 | 723.0 | 27 | 4.36 | 0.000000 | 68.0 | 31 | ... | 67.0 | 7.13 | 65.0 | 33.6 | 454.366654 | 12777511.0 | 9.4 | 9.4 | 0.407 | 9.2 |
| 2934 | Zimbabwe | 2003 | Developing | 44.5 | 715.0 | 26 | 4.06 | 0.000000 | 7.0 | 998 | ... | 7.0 | 6.52 | 68.0 | 36.7 | 453.351155 | 12633897.0 | 9.8 | 9.9 | 0.418 | 9.5 |
| 2935 | Zimbabwe | 2002 | Developing | 44.8 | 73.0 | 25 | 4.43 | 0.000000 | 73.0 | 304 | ... | 73.0 | 6.53 | 71.0 | 39.8 | 57.348340 | 125525.0 | 1.2 | 1.3 | 0.427 | 10.0 |
| 2936 | Zimbabwe | 2001 | Developing | 45.3 | 686.0 | 25 | 1.72 | 0.000000 | 76.0 | 529 | ... | 76.0 | 6.16 | 75.0 | 42.1 | 548.587312 | 12366165.0 | 1.6 | 1.7 | 0.427 | 9.8 |
| 2937 | Zimbabwe | 2000 | Developing | 46.0 | 665.0 | 24 | 1.68 | 0.000000 | 79.0 | 1483 | ... | 78.0 | 7.10 | 78.0 | 43.5 | 547.358878 | 12222251.0 | 11.0 | 11.2 | 0.434 | 9.8 |
2938 rows × 22 columns
#Filter to rows where life expectancy exceeds 70 years (the column name has a trailing space)
life_df.loc[life_df['Life expectancy '] > 70]
| | Country | Year | Status | Life expectancy | Adult Mortality | infant deaths | Alcohol | percentage expenditure | Hepatitis B | Measles | ... | Polio | Total expenditure | Diphtheria | HIV/AIDS | GDP | Population | thinness 1-19 years | thinness 5-9 years | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | Albania | 2015 | Developing | 77.8 | 74.0 | 0 | 4.60 | 364.975229 | 99.0 | 0 | ... | 99.0 | 6.00 | 99.0 | 0.1 | 3954.227830 | 28873.0 | 1.2 | 1.3 | 0.762 | 14.2 |
| 17 | Albania | 2014 | Developing | 77.5 | 8.0 | 0 | 4.51 | 428.749067 | 98.0 | 0 | ... | 98.0 | 5.88 | 98.0 | 0.1 | 4575.763787 | 288914.0 | 1.2 | 1.3 | 0.761 | 14.2 |
| 18 | Albania | 2013 | Developing | 77.2 | 84.0 | 0 | 4.76 | 430.876979 | 99.0 | 0 | ... | 99.0 | 5.66 | 99.0 | 0.1 | 4414.723140 | 289592.0 | 1.3 | 1.4 | 0.759 | 14.2 |
| 19 | Albania | 2012 | Developing | 76.9 | 86.0 | 0 | 5.14 | 412.443356 | 99.0 | 9 | ... | 99.0 | 5.59 | 99.0 | 0.1 | 4247.614380 | 2941.0 | 1.3 | 1.4 | 0.752 | 14.2 |
| 20 | Albania | 2011 | Developing | 76.6 | 88.0 | 0 | 5.37 | 437.062100 | 99.0 | 28 | ... | 99.0 | 5.71 | 99.0 | 0.1 | 4437.178680 | 295195.0 | 1.4 | 1.5 | 0.738 | 13.3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2885 | Viet Nam | 2004 | Developing | 74.2 | 136.0 | 29 | 2.86 | 0.000000 | 94.0 | 217 | ... | 96.0 | 5.90 | 96.0 | 0.2 | NaN | NaN | 15.4 | 16.1 | 0.601 | 11.0 |
| 2886 | Viet Nam | 2003 | Developing | 74.0 | 137.0 | 30 | 2.19 | 0.000000 | 78.0 | 2297 | ... | 96.0 | 4.84 | 99.0 | 0.2 | NaN | NaN | 15.6 | 16.2 | 0.592 | 10.9 |
| 2887 | Viet Nam | 2002 | Developing | 73.8 | 137.0 | 30 | 2.03 | 0.000000 | NaN | 6755 | ... | 92.0 | 4.70 | 75.0 | 0.2 | NaN | NaN | 15.6 | 16.3 | 0.584 | 10.7 |
| 2888 | Viet Nam | 2001 | Developing | 73.6 | 138.0 | 32 | 1.84 | 0.000000 | NaN | 12058 | ... | 96.0 | 5.17 | 96.0 | 0.1 | NaN | NaN | 15.7 | 16.4 | 0.576 | 10.6 |
| 2889 | Viet Nam | 2000 | Developing | 73.4 | 139.0 | 33 | 1.60 | 0.000000 | NaN | 16512 | ... | 96.0 | 4.89 | 96.0 | 0.1 | NaN | NaN | 15.8 | 16.4 | 0.569 | 10.4 |
1620 rows × 22 columns
life_df.columns
Index(['Country', 'Year', 'Status', 'Life expectancy ', 'Adult Mortality',
'infant deaths', 'Alcohol', 'percentage expenditure', 'Hepatitis B',
'Measles ', ' BMI ', 'under-five deaths ', 'Polio', 'Total expenditure',
'Diphtheria ', ' HIV/AIDS', 'GDP', 'Population',
' thinness 1-19 years', ' thinness 5-9 years',
'Income composition of resources', 'Schooling'],
dtype='object')
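Several of the column names above carry leading or trailing spaces ('Life expectancy ', 'Measles ', ' BMI '), which is why the filter earlier had to spell the name with its trailing space. As an optional cleanup, here is a minimal sketch on a toy frame (the values and the `toy_life` / `toy_clean` names are made up for illustration). Note that the other cells in this notebook reference the original, unstripped names, so applying this to `life_df` would require updating them.

```python
import pandas as pd

# Toy frame reusing a few of the column names shown above (values are made up).
toy_life = pd.DataFrame({
    'Country': ['A', 'B'],
    'Life expectancy ': [71.0, 64.5],
    ' BMI ': [25.1, 19.3],
    ' HIV/AIDS': [0.1, 0.4],
})

# Strip leading/trailing spaces from every column name; the original frame is left unchanged.
toy_clean = toy_life.rename(columns=lambda name: name.strip())

print(list(toy_clean.columns))
```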
1) Join data sets df and life_df to be covid_df. Perform any data cleaning.
2) Explore data to discover relationships and features. We will explore the data using both statistics and data visualization.
3) Perform feature engineering by reviewing the most important variables by conducting 'feature importance' and transforming into features to predict, classify, or measure our target variable.
Both data sets include a 'Country' column, which will serve as the join key. However, life_df contains multiple years for each country, so we will keep only the most recent year (2015) for each country before merging the data sets.
latest_year = life_df['Year'].max()
latest_year
2015
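The maximum year only tells us which year to filter on. Below is a minimal sketch, on a made-up toy frame, of two ways to reduce a multi-year table to one row per country: filtering on the overall latest year, or keeping each country's own latest row. The names `toy_life`, `rows_latest`, and `latest_per_country` are illustrative and not part of the notebook. The two options differ only if some country lacks a row for the latest year.

```python
import pandas as pd

# Toy multi-year table (values are made up; column names mirror life_df).
toy_life = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B', 'C'],
    'Year': [2014, 2015, 2015, 2013, 2012],
    'Life expectancy ': [70.1, 70.4, 65.0, 64.2, 58.9],
})

# Option 1: keep only rows from the overall most recent year.
latest_year = toy_life['Year'].max()
rows_latest = toy_life[toy_life['Year'] == latest_year]

# Option 2: keep each country's own most recent row, even if it is older.
latest_per_country = (
    toy_life.sort_values('Year')
            .drop_duplicates(subset='Country', keep='last')
            .sort_values('Country')
            .reset_index(drop=True)
)

print(rows_latest)
print(latest_per_country)
```

In the toy data, country C has no row for the latest year, so it survives only under option 2.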
life_df.nunique()
| Column | Unique values |
|---|---|
| Country | 193 |
| Year | 16 |
| Status | 2 |
| Life expectancy | 362 |
| Adult Mortality | 425 |
| infant deaths | 209 |
| Alcohol | 1076 |
| percentage expenditure | 2328 |
| Hepatitis B | 87 |
| Measles | 958 |
| BMI | 608 |
| under-five deaths | 252 |
| Polio | 73 |
| Total expenditure | 818 |
| Diphtheria | 81 |
| HIV/AIDS | 200 |
| GDP | 2490 |
| Population | 2278 |
| thinness 1-19 years | 200 |
| thinness 5-9 years | 207 |
| Income composition of resources | 625 |
| Schooling | 173 |

dtype: int64
df.nunique()
| Column | Unique values |
|---|---|
| Country | 170 |
| Alcoholic Beverages | 3 |
| Animal Products | 170 |
| Animal fats | 169 |
| Aquatic Products, Other | 6 |
| Cereals - Excluding Beer | 170 |
| Eggs | 169 |
| Fish, Seafood | 170 |
| Fruits - Excluding Wine | 168 |
| Meat | 170 |
| Miscellaneous | 137 |
| Milk - Excluding Butter | 169 |
| Offals | 167 |
| Oilcrops | 170 |
| Pulses | 160 |
| Spices | 155 |
| Starchy Roots | 166 |
| Stimulants | 169 |
| Sugar Crops | 11 |
| Sugar & Sweeteners | 9 |
| Treenuts | 162 |
| Vegetal Products | 170 |
| Vegetable Oils | 170 |
| Vegetables | 168 |
| Obesity | 120 |
| Undernourished | 98 |
| Confirmed | 161 |
| Deaths | 145 |
| Recovered | 161 |
| Active | 155 |
| Population | 170 |
| Unit (all except Population) | 1 |

dtype: int64
#Drop less-used columns from the first data frame:
#Alcoholic Beverages, Aquatic Products (Other), Sugar Crops, Sugar & Sweeteners, Miscellaneous, Undernourished
df2 = df.drop(['Alcoholic Beverages', 'Aquatic Products, Other', 'Sugar & Sweeteners', 'Sugar Crops', 'Miscellaneous', 'Undernourished'], axis=1)
df2
| | Country | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | ... | Vegetal Products | Vegetable Oils | Vegetables | Obesity | Confirmed | Deaths | Recovered | Active | Population | Unit (all except Population) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 21.6397 | 6.2224 | 8.0353 | 0.6859 | 0.0327 | 0.4246 | 6.1244 | 8.2803 | 0.3103 | ... | 28.3684 | 17.0831 | 0.3593 | 4.5 | 0.021411 | 0.000492 | 0.002445 | 0.018474 | 38042000.0 | % |
| 1 | Albania | 32.0002 | 3.4172 | 2.6734 | 1.6448 | 0.1445 | 0.6418 | 8.7428 | 17.7576 | 0.2933 | ... | 17.9998 | 9.2443 | 0.6503 | 22.3 | 0.033730 | 0.001085 | 0.026522 | 0.006123 | 2858000.0 | % |
| 2 | Algeria | 14.4175 | 0.8972 | 4.2035 | 1.2171 | 0.2008 | 0.5772 | 3.8961 | 8.0934 | 0.1067 | ... | 35.5857 | 27.3606 | 0.5145 | 26.6 | 0.017375 | 0.001309 | 0.009142 | 0.006925 | 43406000.0 | % |
| 3 | Angola | 15.3041 | 1.3130 | 6.5545 | 0.1539 | 1.4155 | 0.3488 | 11.0268 | 1.2309 | 0.1539 | ... | 34.7010 | 22.4638 | 0.1231 | 6.8 | 0.000165 | 0.000010 | 0.000054 | 0.000102 | 31427000.0 | % |
| 4 | Antigua and Barbuda | 27.7033 | 4.6686 | 3.2153 | 0.3872 | 1.5263 | 1.2177 | 14.3202 | 6.6607 | 0.1347 | ... | 22.2995 | 14.4436 | 0.2469 | 19.1 | 0.025773 | 0.003093 | 0.019588 | 0.003093 | 97000.0 | % |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 165 | Venezuela (Bolivarian Republic of) | 16.3261 | 2.2673 | 2.5449 | 0.6555 | 0.5707 | 0.9640 | 7.0949 | 5.5217 | 0.2082 | ... | 33.6855 | 29.5211 | 0.1851 | 25.2 | 0.002890 | 0.000035 | 0.000919 | 0.001936 | 28516000.0 | % |
| 166 | Vietnam | 33.2484 | 3.8238 | 3.7155 | 0.7839 | 1.1217 | 0.4079 | 26.4292 | 0.7520 | 0.3378 | ... | 16.7548 | 5.6211 | 0.6373 | 2.1 | 0.000339 | 0.000000 | 0.000275 | 0.000064 | 95656000.0 | % |
| 167 | Yemen | 12.5401 | 2.0131 | 11.5271 | 0.5514 | 0.3847 | 0.2564 | 8.0010 | 1.3463 | 0.2436 | ... | 37.4535 | 23.6312 | 0.1667 | 14.1 | 0.000631 | 0.000103 | 0.000017 | 0.000511 | 29162000.0 | % |
| 168 | Zambia | 9.6005 | 1.6113 | 14.3225 | 0.6266 | 1.0070 | 0.1343 | 4.9010 | 1.2756 | 0.1790 | ... | 40.3939 | 15.2848 | 0.1567 | 6.5 | 0.004658 | 0.000039 | 0.001103 | 0.003516 | 17861000.0 | % |
| 169 | Zimbabwe | 10.3796 | 2.9543 | 9.7922 | 0.3682 | 0.2455 | 0.0614 | 4.5674 | 2.1040 | 0.1315 | ... | 39.6248 | 26.9396 | 0.0789 | 12.3 | 0.000328 | 0.000027 | 0.000123 | 0.000178 | 14645000.0 | % |
170 rows × 26 columns
#Drop less-used columns from the second data frame (life_df):
life_df2 = life_df.drop(['Measles ','Hepatitis B','infant deaths',' thinness 1-19 years','Alcohol', ' thinness 5-9 years','GDP','Total expenditure'], axis=1)
life_df2
| | Country | Year | Status | Life expectancy | Adult Mortality | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 2015 | Developing | 65.0 | 263.0 | 71.279624 | 19.1 | 83 | 6.0 | 65.0 | 0.1 | 33736494.0 | 0.479 | 10.1 |
| 1 | Afghanistan | 2014 | Developing | 59.9 | 271.0 | 73.523582 | 18.6 | 86 | 58.0 | 62.0 | 0.1 | 327582.0 | 0.476 | 10.0 |
| 2 | Afghanistan | 2013 | Developing | 59.9 | 268.0 | 73.219243 | 18.1 | 89 | 62.0 | 64.0 | 0.1 | 31731688.0 | 0.470 | 9.9 |
| 3 | Afghanistan | 2012 | Developing | 59.5 | 272.0 | 78.184215 | 17.6 | 93 | 67.0 | 67.0 | 0.1 | 3696958.0 | 0.463 | 9.8 |
| 4 | Afghanistan | 2011 | Developing | 59.2 | 275.0 | 7.097109 | 17.2 | 97 | 68.0 | 68.0 | 0.1 | 2978599.0 | 0.454 | 9.5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2933 | Zimbabwe | 2004 | Developing | 44.3 | 723.0 | 0.000000 | 27.1 | 42 | 67.0 | 65.0 | 33.6 | 12777511.0 | 0.407 | 9.2 |
| 2934 | Zimbabwe | 2003 | Developing | 44.5 | 715.0 | 0.000000 | 26.7 | 41 | 7.0 | 68.0 | 36.7 | 12633897.0 | 0.418 | 9.5 |
| 2935 | Zimbabwe | 2002 | Developing | 44.8 | 73.0 | 0.000000 | 26.3 | 40 | 73.0 | 71.0 | 39.8 | 125525.0 | 0.427 | 10.0 |
| 2936 | Zimbabwe | 2001 | Developing | 45.3 | 686.0 | 0.000000 | 25.9 | 39 | 76.0 | 75.0 | 42.1 | 12366165.0 | 0.427 | 9.8 |
| 2937 | Zimbabwe | 2000 | Developing | 46.0 | 665.0 | 0.000000 | 25.5 | 39 | 78.0 | 78.0 | 43.5 | 12222251.0 | 0.434 | 9.8 |
2938 rows × 14 columns
life_df.tail(140)
| | Country | Year | Status | Life expectancy | Adult Mortality | infant deaths | Alcohol | percentage expenditure | Hepatitis B | Measles | ... | Polio | Total expenditure | Diphtheria | HIV/AIDS | GDP | Population | thinness 1-19 years | thinness 5-9 years | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2798 | United States of America | 2011 | Developed | 78.7 | 16.0 | 25 | 8.67 | 0.0 | 91.0 | 220 | ... | 94.0 | 17.60 | 96.0 | 0.1 | NaN | NaN | 0.7 | 0.6 | NaN | NaN |
| 2799 | United States of America | 2010 | Developed | 78.7 | 15.0 | 25 | 8.55 | 0.0 | 92.0 | 63 | ... | 93.0 | 17.20 | 95.0 | 0.1 | NaN | NaN | 0.7 | 0.6 | NaN | NaN |
| 2800 | United States of America | 2009 | Developed | 78.5 | 18.0 | 26 | 8.71 | 0.0 | 92.0 | 71 | ... | 93.0 | 17.00 | 95.0 | 0.1 | NaN | NaN | 0.7 | 0.6 | NaN | NaN |
| 2801 | United States of America | 2008 | Developed | 78.2 | 18.0 | 27 | 8.74 | 0.0 | 94.0 | 140 | ... | 94.0 | 16.20 | 96.0 | 0.1 | NaN | NaN | 0.7 | 0.6 | NaN | NaN |
| 2802 | United States of America | 2007 | Developed | 78.1 | 11.0 | 27 | 8.74 | 0.0 | 93.0 | 43 | ... | 93.0 | 15.57 | 96.0 | 0.1 | NaN | NaN | 0.7 | 0.6 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2933 | Zimbabwe | 2004 | Developing | 44.3 | 723.0 | 27 | 4.36 | 0.0 | 68.0 | 31 | ... | 67.0 | 7.13 | 65.0 | 33.6 | 454.366654 | 12777511.0 | 9.4 | 9.4 | 0.407 | 9.2 |
| 2934 | Zimbabwe | 2003 | Developing | 44.5 | 715.0 | 26 | 4.06 | 0.0 | 7.0 | 998 | ... | 7.0 | 6.52 | 68.0 | 36.7 | 453.351155 | 12633897.0 | 9.8 | 9.9 | 0.418 | 9.5 |
| 2935 | Zimbabwe | 2002 | Developing | 44.8 | 73.0 | 25 | 4.43 | 0.0 | 73.0 | 304 | ... | 73.0 | 6.53 | 71.0 | 39.8 | 57.348340 | 125525.0 | 1.2 | 1.3 | 0.427 | 10.0 |
| 2936 | Zimbabwe | 2001 | Developing | 45.3 | 686.0 | 25 | 1.72 | 0.0 | 76.0 | 529 | ... | 76.0 | 6.16 | 75.0 | 42.1 | 548.587312 | 12366165.0 | 1.6 | 1.7 | 0.427 | 9.8 |
| 2937 | Zimbabwe | 2000 | Developing | 46.0 | 665.0 | 24 | 1.68 | 0.0 | 79.0 | 1483 | ... | 78.0 | 7.10 | 78.0 | 43.5 | 547.358878 | 12222251.0 | 11.0 | 11.2 | 0.434 | 9.8 |
140 rows × 22 columns
We will join both dataframes and select only Year == 2015 from the life_df2 dataframe to create a new dataframe: covid_df.
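The cell that performs this join is not shown here. The `Population_x` and `Population_y` columns in the correlation output are consistent with a `pd.merge` on 'Country' using the default suffixes, so the sketch below assumes that approach. The toy frames and their values are made up for illustration, and `toy_diet`, `toy_life`, `life_2015`, `toy_covid`, and `unmatched` are hypothetical names. The second merge is a diagnostic: the previews above show "Vietnam" in one data set and "Viet Nam" in the other, and an exact-match join will not pair such spellings.

```python
import pandas as pd

# Toy stand-ins for df2 and life_df2 (values are illustrative only).
toy_diet = pd.DataFrame({
    'Country': ['Albania', 'Vietnam', 'Yemen'],
    'Treenuts': [0.5, 0.2, 0.1],
    'Population': [2858000.0, 95656000.0, 29162000.0],
})
toy_life = pd.DataFrame({
    'Country': ['Albania', 'Albania', 'Viet Nam', 'Yemen'],
    'Year': [2015, 2014, 2015, 2015],
    'Life expectancy ': [77.8, 77.5, 76.0, 65.7],
    'Population': [28873.0, 288914.0, None, 2691612.0],
})

# Keep only the 2015 rows so each country appears at most once.
life_2015 = toy_life[toy_life['Year'] == 2015]

# Inner join on the shared key; the overlapping 'Population' columns
# receive the default _x / _y suffixes.
toy_covid = pd.merge(toy_diet, life_2015, on='Country', how='inner')

# Diagnostic: an outer join with indicator=True lists countries that
# appear in only one of the two tables (e.g. spelling differences).
check = pd.merge(toy_diet, life_2015, on='Country', how='outer', indicator=True)
unmatched = check.loc[check['_merge'] != 'both', ['Country', '_merge']]

print(toy_covid)
print(unmatched)
```

Reviewing `unmatched` before settling on the inner join makes it easier to decide whether to harmonize country names first or accept the dropped rows.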
covid_df.corr()
| | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | Oilcrops | ... | Adult Mortality | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population_y | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Animal Products | 1.000000 | 0.696317 | -0.464931 | 0.470419 | -0.021730 | -0.112444 | 0.736699 | 0.634589 | 0.065823 | -0.436742 | ... | -0.401995 | 0.113623 | 0.427160 | -0.204313 | 0.385766 | 0.286291 | -0.374073 | -0.175792 | 0.689126 | 0.616102 |
| Animal fats | 0.696317 | 1.000000 | -0.407189 | 0.281349 | -0.118791 | -0.168326 | 0.231632 | 0.342875 | -0.179648 | -0.340163 | ... | -0.353027 | -0.008614 | 0.369455 | -0.039779 | 0.279839 | 0.239359 | -0.313355 | -0.083317 | 0.594815 | 0.561922 |
| Cereals - Excluding Beer | -0.464931 | -0.407189 | 1.000000 | -0.301811 | -0.041983 | 0.016833 | -0.278022 | -0.276145 | 0.281311 | 0.113556 | ... | 0.400019 | -0.025796 | -0.429014 | 0.174882 | -0.423820 | -0.275711 | 0.511248 | 0.063692 | -0.629482 | -0.569581 |
| Eggs | 0.470419 | 0.281349 | -0.301811 | 1.000000 | 0.206953 | -0.061869 | 0.242745 | 0.273953 | -0.122609 | -0.342333 | ... | -0.369550 | 0.076464 | 0.230078 | -0.109621 | 0.259077 | 0.208234 | -0.381427 | 0.056293 | 0.562982 | 0.457319 |
| Fish, Seafood | -0.021730 | -0.118791 | -0.041983 | 0.206953 | 1.000000 | 0.025822 | 0.015310 | -0.272491 | -0.091862 | 0.346553 | ... | -0.044788 | -0.071822 | -0.130329 | -0.063458 | -0.013412 | -0.044516 | -0.075533 | 0.033109 | 0.003294 | -0.006498 |
| Fruits - Excluding Wine | -0.112444 | -0.168326 | 0.016833 | -0.061869 | 0.025822 | 1.000000 | -0.026609 | -0.062986 | 0.075238 | 0.050115 | ... | 0.027664 | 0.006149 | -0.068760 | -0.039160 | 0.034929 | -0.028214 | -0.047952 | 0.022113 | -0.096971 | -0.061888 |
| Meat | 0.736699 | 0.231632 | -0.278022 | 0.242745 | 0.015310 | -0.026609 | 1.000000 | 0.150011 | 0.229228 | -0.241335 | ... | -0.146436 | -0.019060 | 0.258219 | -0.261477 | 0.223089 | 0.099342 | -0.146077 | -0.189613 | 0.384123 | 0.345947 |
| Milk - Excluding Butter | 0.634589 | 0.342875 | -0.276145 | 0.273953 | -0.272491 | -0.062986 | 0.150011 | 1.000000 | 0.050456 | -0.416321 | ... | -0.340831 | 0.307515 | 0.312060 | -0.064060 | 0.307814 | 0.297828 | -0.302312 | -0.118948 | 0.439168 | 0.371274 |
| Offals | 0.065823 | -0.179648 | 0.281311 | -0.122609 | -0.091862 | 0.075238 | 0.229228 | 0.050456 | 1.000000 | -0.012253 | ... | 0.306247 | 0.120770 | -0.160281 | -0.009059 | -0.184620 | -0.339397 | 0.283079 | -0.036222 | -0.261051 | -0.244990 |
| Oilcrops | -0.436742 | -0.340163 | 0.113556 | -0.342333 | 0.346553 | 0.050115 | -0.241335 | -0.416321 | -0.012253 | 1.000000 | ... | 0.132772 | -0.010549 | -0.212996 | 0.040473 | -0.244861 | -0.186438 | 0.110726 | -0.017289 | -0.377121 | -0.349467 |
| Pulses | -0.425094 | -0.315595 | 0.406925 | -0.331933 | -0.093786 | 0.483089 | -0.318487 | -0.190892 | 0.056014 | 0.149362 | ... | 0.288155 | -0.035522 | -0.351309 | 0.288831 | -0.110936 | -0.118573 | 0.185870 | -0.003634 | -0.511181 | -0.472831 |
| Spices | -0.183738 | -0.200584 | 0.123360 | -0.002724 | 0.226539 | 0.011036 | -0.162154 | -0.084517 | -0.090423 | 0.110913 | ... | 0.058863 | -0.048218 | -0.165202 | 0.138913 | 0.015188 | 0.110768 | -0.040758 | 0.025912 | -0.136138 | -0.138357 |
| Starchy Roots | -0.389973 | -0.303520 | 0.210341 | -0.351431 | 0.169265 | 0.440201 | -0.170990 | -0.387129 | 0.132172 | 0.298163 | ... | 0.420362 | -0.045116 | -0.349777 | 0.145597 | -0.211763 | -0.295134 | 0.302615 | 0.121661 | -0.448664 | -0.341699 |
| Stimulants | 0.507705 | 0.280966 | -0.264135 | 0.278077 | 0.005564 | -0.089096 | 0.306149 | 0.474636 | -0.027747 | -0.286485 | ... | -0.299139 | -0.022776 | 0.246330 | -0.205219 | 0.263951 | 0.226711 | -0.253963 | -0.145858 | 0.455781 | 0.380227 |
| Treenuts | 0.159286 | 0.161809 | -0.205887 | 0.292638 | 0.160767 | -0.091491 | -0.056260 | 0.196338 | -0.146111 | -0.214664 | ... | -0.304072 | 0.022282 | 0.247513 | -0.080001 | 0.193674 | 0.218086 | -0.170558 | 0.024855 | 0.301874 | 0.279961 |
| Vegetal Products | -1.000000 | -0.696306 | 0.464931 | -0.470465 | 0.021678 | 0.112282 | -0.736701 | -0.634573 | -0.065814 | 0.436730 | ... | 0.401972 | -0.113612 | -0.427149 | 0.204334 | -0.385817 | -0.286321 | 0.374038 | 0.175868 | -0.689133 | -0.616118 |
| Vegetable Oils | -0.662161 | -0.369446 | 0.005906 | -0.197051 | -0.249428 | -0.070143 | -0.558249 | -0.361765 | -0.197600 | -0.226120 | ... | 0.218357 | -0.112548 | -0.151018 | 0.133938 | -0.125381 | -0.111301 | 0.159511 | 0.190629 | -0.261272 | -0.221994 |
| Vegetables | 0.083535 | -0.083130 | 0.044259 | 0.164613 | -0.008991 | 0.032403 | 0.003719 | 0.246313 | 0.081951 | -0.131136 | ... | -0.041244 | 0.138704 | 0.037330 | 0.045392 | 0.066256 | 0.063391 | -0.160218 | 0.093412 | 0.034154 | -0.029775 |
| Obesity | 0.430380 | 0.391204 | -0.497187 | 0.309372 | -0.158797 | -0.092355 | 0.271295 | 0.272539 | -0.251688 | -0.157551 | ... | -0.467360 | 0.005017 | 0.770613 | -0.322088 | 0.304861 | 0.298984 | -0.373918 | -0.075649 | 0.692220 | 0.633168 |
| Confirmed | 0.357898 | 0.353236 | -0.362804 | 0.291238 | 0.053155 | -0.056920 | 0.151191 | 0.238475 | -0.171835 | -0.247925 | ... | -0.392181 | -0.040137 | 0.418583 | -0.156785 | 0.261332 | 0.169787 | -0.227340 | -0.017993 | 0.531878 | 0.471506 |
| Deaths | 0.226872 | 0.329210 | -0.295061 | 0.129442 | -0.025360 | -0.065152 | 0.032719 | 0.155998 | -0.173631 | -0.178322 | ... | -0.294966 | -0.029003 | 0.314081 | -0.095783 | 0.170265 | 0.138092 | -0.153854 | 0.021039 | 0.396425 | 0.381687 |
| Recovered | 0.338004 | 0.318047 | -0.288803 | 0.187248 | -0.009306 | -0.048483 | 0.178391 | 0.227385 | -0.145488 | -0.224722 | ... | -0.337419 | -0.024235 | 0.360196 | -0.128309 | 0.211297 | 0.157846 | -0.187125 | -0.007453 | 0.472942 | 0.423891 |
| Active | 0.213949 | 0.214089 | -0.278007 | 0.301388 | 0.121780 | -0.036138 | 0.053351 | 0.138807 | -0.114165 | -0.157233 | ... | -0.270716 | -0.041738 | 0.288604 | -0.120871 | 0.203096 | 0.099637 | -0.171249 | -0.031949 | 0.343856 | 0.292468 |
| Population_x | 0.000845 | 0.019125 | -0.005294 | 0.141391 | -0.012514 | -0.043501 | 0.006982 | -0.053587 | 0.083154 | -0.033425 | ... | -0.009411 | -0.021705 | -0.109352 | 0.687921 | 0.006104 | 0.016436 | -0.044615 | 0.118447 | -0.031850 | -0.041698 |
| Year | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Life expectancy | 0.627147 | 0.538355 | -0.553566 | 0.564881 | 0.046156 | -0.036785 | 0.312417 | 0.435685 | -0.296445 | -0.344197 | ... | -0.769827 | 0.044123 | 0.539841 | -0.270825 | 0.524219 | 0.455457 | -0.611583 | -0.064462 | 0.907585 | 0.819553 |
| Adult Mortality | -0.401995 | -0.353027 | 0.400019 | -0.369550 | -0.044788 | 0.027664 | -0.146436 | -0.340831 | 0.306247 | 0.132772 | ... | 1.000000 | -0.043049 | -0.394249 | 0.209851 | -0.402464 | -0.286336 | 0.664830 | 0.057692 | -0.635897 | -0.532971 |
| percentage expenditure | 0.113623 | -0.008614 | -0.025796 | 0.076464 | -0.071822 | 0.006149 | -0.019060 | 0.307515 | 0.120770 | -0.010549 | ... | -0.043049 | 1.000000 | 0.037973 | -0.015977 | -0.000456 | 0.035585 | -0.038246 | -0.020951 | 0.011441 | 0.014842 |
| BMI | 0.427160 | 0.369455 | -0.429014 | 0.230078 | -0.130329 | -0.068760 | 0.258219 | 0.312060 | -0.160281 | -0.212996 | ... | -0.394249 | 0.037973 | 1.000000 | -0.236015 | 0.288257 | 0.217996 | -0.299818 | -0.006142 | 0.617654 | 0.592342 |
| under-five deaths | -0.204313 | -0.039779 | 0.174882 | -0.109621 | -0.063458 | -0.039160 | -0.261477 | -0.064060 | -0.009059 | 0.040473 | ... | 0.209851 | -0.015977 | -0.236015 | 1.000000 | -0.164226 | -0.153950 | 0.134492 | 0.308069 | -0.247259 | -0.250599 |
| Polio | 0.385766 | 0.279839 | -0.423820 | 0.259077 | -0.013412 | 0.034929 | 0.223089 | 0.307814 | -0.184620 | -0.244861 | ... | -0.402464 | -0.000456 | 0.288257 | -0.164226 | 1.000000 | 0.646003 | -0.433104 | -0.267904 | 0.494363 | 0.409904 |
| Diphtheria | 0.286291 | 0.239359 | -0.275711 | 0.208234 | -0.044516 | -0.028214 | 0.099342 | 0.297828 | -0.339397 | -0.186438 | ... | -0.286336 | 0.035585 | 0.217996 | -0.153950 | 0.646003 | 1.000000 | -0.341930 | -0.081235 | 0.425759 | 0.385533 |
| HIV/AIDS | -0.374073 | -0.313355 | 0.511248 | -0.381427 | -0.075533 | -0.047952 | -0.146077 | -0.302312 | 0.283079 | 0.110726 | ... | 0.664830 | -0.038246 | -0.299818 | 0.134492 | -0.433104 | -0.341930 | 1.000000 | 0.044697 | -0.492405 | -0.393279 |
| Population_y | -0.175792 | -0.083317 | 0.063692 | 0.056293 | 0.033109 | 0.022113 | -0.189613 | -0.118948 | -0.036222 | -0.017289 | ... | 0.057692 | -0.020951 | -0.006142 | 0.308069 | -0.267904 | -0.081235 | 0.044697 | 1.000000 | -0.004053 | 0.012996 |
| Income composition of resources | 0.689126 | 0.594815 | -0.629482 | 0.562982 | 0.003294 | -0.096971 | 0.384123 | 0.439168 | -0.261051 | -0.377121 | ... | -0.635897 | 0.011441 | 0.617654 | -0.247259 | 0.494363 | 0.425759 | -0.492405 | -0.004053 | 1.000000 | 0.926394 |
| Schooling | 0.616102 | 0.561922 | -0.569581 | 0.457319 | -0.006498 | -0.061888 | 0.345947 | 0.371274 | -0.244990 | -0.349467 | ... | -0.532971 | 0.014842 | 0.592342 | -0.250599 | 0.409904 | 0.385533 | -0.393279 | 0.012996 | 0.926394 | 1.000000 |
36 rows × 36 columns
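A 36 × 36 matrix is hard to scan by eye. As a rough sketch, the strongest pairs can be pulled out by keeping the upper triangle and filtering on absolute correlation. The four-variable matrix below copies its values from the output above; the 0.9 cutoff and the names `corr`, `pairs`, and `high` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Four columns and their pairwise correlations, copied from the matrix above.
cols = ['Life expectancy ', 'Income composition of resources',
        'Schooling', 'Adult Mortality']
corr = pd.DataFrame(
    [[1.0, 0.907585, 0.819553, -0.769827],
     [0.907585, 1.0, 0.926394, -0.635897],
     [0.819553, 0.926394, 1.0, -0.532971],
     [-0.769827, -0.635897, -0.532971, 1.0]],
    index=cols, columns=cols,
)

# Keep only the upper triangle (k=1 excludes the diagonal) so each
# pair appears once, then flatten to a Series indexed by (row, column).
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack()

# Hypothetical cutoff for "highly correlated".
threshold = 0.9
high = pairs[pairs.abs() > threshold].sort_values(ascending=False)

print(high)
```

On the full data, the same steps would start from `covid_df.corr()`. Recent pandas versions may require `numeric_only=True` there because covid_df also holds text columns.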
ProfileReport(covid_df)
Dataset info
| Number of variables | 39 |
|---|---|
| Number of observations | 157 |
| Total Missing (%) | 0.8% |
| Total size in memory | 49.1 KiB |
| Average record size in memory | 320.0 B |
Variables types
| Numeric | 33 |
|---|---|
| Categorical | 1 |
| Boolean | 0 |
| Date | 0 |
| Text (Unique) | 1 |
| Rejected | 4 |
| Unsupported | 0 |
Warnings
| Warning | Label |
|---|---|
| Offals has 2 / 1.3% zeros | Zeros |
| Pulses has 7 / 4.5% zeros | Zeros |
| Spices has 10 / 6.4% zeros | Zeros |
| Stimulants has 2 / 1.3% zeros | Zeros |
| Treenuts has 8 / 5.1% zeros | Zeros |
| Confirmed has 6 / 3.8% missing values | Missing |
| Deaths has 6 / 3.8% missing values | Missing |
| Deaths has 14 / 8.9% zeros | Zeros |
| Recovered has 6 / 3.8% missing values | Missing |
| Active has 6 / 3.8% missing values | Missing |
| Active has 5 / 3.2% zeros | Zeros |
| Unit (all except Population) has constant value % | Rejected |
| Year has constant value 2015 | Rejected |
| percentage expenditure has 155 / 98.7% zeros | Zeros |
| under-five deaths has 45 / 28.7% zeros | Zeros |
| Population_y has 25 / 15.9% missing values | Missing |
| Income composition of resources is highly correlated with Life expectancy (ρ = 0.90758) | Rejected |
| Schooling is highly correlated with Income composition of resources (ρ = 0.92639) | Rejected |

Country
Categorical, Unique
| First 3 values |
|---|
| Argentina |
| Peru |
| Cyprus |
| Last 3 values |
|---|
| Gabon |
| France |
| Djibouti |
First 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| Afghanistan | 1 | 0.6% |
| Albania | 1 | 0.6% |
| Algeria | 1 | 0.6% |
| Angola | 1 | 0.6% |
| Antigua and Barbuda | 1 | 0.6% |
Last 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| Vanuatu | 1 | 0.6% |
| Venezuela (Bolivarian Republic of) | 1 | 0.6% |
| Yemen | 1 | 0.6% |
| Zambia | 1 | 0.6% |
| Zimbabwe | 1 | 0.6% |
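The warnings listed earlier in this report (zeros, missing values, constant columns, highly correlated pairs) come down to simple per-column checks. The sketch below is not the profiling library's own code; it is a minimal standard-library illustration of the idea. The helper names `column_warnings` and `pearson`, the synthetic sample column, and the 0.9 correlation cutoff are all assumptions made for this example.

```python
import math

HIGH_CORR = 0.9  # assumed cutoff for "highly correlated"; not taken from the report


def column_warnings(name, values):
    """Return profile-style warnings for one column; None marks a missing entry."""
    n = len(values)
    present = [v for v in values if v is not None]
    found = []
    n_missing = n - len(present)
    if n_missing:
        found.append(f"{name} has {n_missing} / {n_missing / n:.1%} missing values")
    n_zeros = sum(1 for v in present if v == 0)
    if n_zeros:
        found.append(f"{name} has {n_zeros} / {n_zeros / n:.1%} zeros")
    if present and len(set(present)) == 1:
        found.append(f"{name} has constant value {present[0]}")
    return found


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic stand-in for the Deaths column: 157 rows, 6 missing, 14 zeros.
deaths = [None] * 6 + [0.0] * 14 + [0.001 * i for i in range(1, 138)]
print(column_warnings("Deaths", deaths))
```

With the synthetic column, the two messages match the wording of the Deaths warnings in the report (6 / 3.8% missing, 14 / 8.9% zeros). A pair of columns would be flagged when `abs(pearson(a, b)) > HIGH_CORR`.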
Animal Products
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 20.595 |
|---|---|
| Minimum | 5.0182 |
| Maximum | 36.902 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 5.0182 |
|---|---|
| 5-th percentile | 7.0546 |
| Q1 | 14.418 |
| Median | 20.928 |
| Q3 | 26.91 |
| 95-th percentile | 32.835 |
| Maximum | 36.902 |
| Range | 31.884 |
| Interquartile range | 12.492 |
Descriptive statistics
| Standard deviation | 8.0638 |
|---|---|
| Coef of variation | 0.39154 |
| Kurtosis | -0.92571 |
| Mean | 20.595 |
| MAD | 6.8101 |
| Skewness | -0.066656 |
| Sum | 3233.5 |
| Variance | 65.026 |
| Memory size | 7.5 KiB |
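Several of the figures in the Animal Products summary are derived from others, so they can be cross-checked with plain arithmetic. The sketch below copies the reported values from the tables above and recomputes range, interquartile range, coefficient of variation, variance, and mean. The dictionary names `reported` and `derived` are illustrative only, and small differences are expected because the report rounds to five significant digits.

```python
# Values copied from the Animal Products summary tables.
reported = {
    "n": 157, "min": 5.0182, "max": 36.902, "q1": 14.418, "q3": 26.91,
    "mean": 20.595, "std": 8.0638, "sum": 3233.5,
    "range": 31.884, "iqr": 12.492, "cv": 0.39154, "variance": 65.026,
}

# Each derived statistic is recomputed from the more basic reported ones.
derived = {
    "range": reported["max"] - reported["min"],   # maximum - minimum
    "iqr": reported["q3"] - reported["q1"],       # Q3 - Q1
    "cv": reported["std"] / reported["mean"],     # standard deviation / mean
    "variance": reported["std"] ** 2,             # standard deviation squared
    "mean": reported["sum"] / reported["n"],      # sum / number of observations
}

for key, value in derived.items():
    print(f"{key}: reported={reported[key]}, derived={value:.4f}")
```

The same checks apply to every numeric variable in the report; only the copied values change.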
| Value | Count | Frequency (%) |
|---|---|---|
| 10.8323 | 1 | 0.6% |
| 25.9903 | 1 | 0.6% |
| 28.4111 | 1 | 0.6% |
| 26.6163 | 1 | 0.6% |
| 17.4631 | 1 | 0.6% |
| 32.6886 | 1 | 0.6% |
| 19.0932 | 1 | 0.6% |
| 25.8451 | 1 | 0.6% |
| 26.7378 | 1 | 0.6% |
| 20.5571 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.0182 | 1 | 0.6% |
| 5.3063 | 1 | 0.6% |
| 5.9931 | 1 | 0.6% |
| 6.0418 | 1 | 0.6% |
| 6.0747 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 34.1264 | 1 | 0.6% |
| 34.4402 | 1 | 0.6% |
| 35.4131 | 1 | 0.6% |
| 36.725 | 1 | 0.6% |
| 36.9018 | 1 | 0.6% |
Animal fats
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 4.1841 |
|---|---|
| Minimum | 0.0348 |
| Maximum | 14.937 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0348 |
|---|---|
| 5-th percentile | 0.53388 |
| Q1 | 1.6113 |
| Median | 3.3013 |
| Q3 | 6.3787 |
| 95-th percentile | 10.556 |
| Maximum | 14.937 |
| Range | 14.902 |
| Interquartile range | 4.7674 |
Descriptive statistics
| Standard deviation | 3.379 |
|---|---|
| Coef of variation | 0.80759 |
| Kurtosis | 0.64579 |
| Mean | 4.1841 |
| MAD | 2.7235 |
| Skewness | 1.0943 |
| Sum | 656.9 |
| Variance | 11.418 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 3.3076 | 2 | 1.3% |
| 1.8698 | 1 | 0.6% |
| 0.6241 | 1 | 0.6% |
| 2.9392 | 1 | 0.6% |
| 6.2224 | 1 | 0.6% |
| 6.1 | 1 | 0.6% |
| 0.4154 | 1 | 0.6% |
| 9.8102 | 1 | 0.6% |
| 3.0066 | 1 | 0.6% |
| 3.4472 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0348 | 1 | 0.6% |
| 0.1678 | 1 | 0.6% |
| 0.2548 | 1 | 0.6% |
| 0.3195 | 1 | 0.6% |
| 0.33899999999999997 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12.6234 | 1 | 0.6% |
| 12.8517 | 1 | 0.6% |
| 13.9753 | 1 | 0.6% |
| 14.2498 | 1 | 0.6% |
| 14.9373 | 1 | 0.6% |
Cereals - Excluding Beer
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 4.4243 |
|---|---|
| Minimum | 0.9908 |
| Maximum | 18.376 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.9908 |
|---|---|
| 5-th percentile | 1.2728 |
| Q1 | 2.0428 |
| Median | 3.3592 |
| Q3 | 5.6789 |
| 95-th percentile | 10.72 |
| Maximum | 18.376 |
| Range | 17.386 |
| Interquartile range | 3.6361 |
Descriptive statistics
| Standard deviation | 3.1996 |
|---|---|
| Coef of variation | 0.72317 |
| Kurtosis | 2.433 |
| Mean | 4.4243 |
| MAD | 2.4375 |
| Skewness | 1.5249 |
| Sum | 694.62 |
| Variance | 10.237 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 4.9807 | 1 | 0.6% |
| 4.3092 | 1 | 0.6% |
| 13.0891 | 1 | 0.6% |
| 1.078 | 1 | 0.6% |
| 1.4883 | 1 | 0.6% |
| 2.0003 | 1 | 0.6% |
| 3.4479 | 1 | 0.6% |
| 7.1664 | 1 | 0.6% |
| 7.986000000000001 | 1 | 0.6% |
| 3.8355 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9908 | 1 | 0.6% |
| 1.078 | 1 | 0.6% |
| 1.1241 | 1 | 0.6% |
| 1.1278 | 1 | 0.6% |
| 1.1778 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12.436 | 1 | 0.6% |
| 13.0891 | 1 | 0.6% |
| 13.4988 | 1 | 0.6% |
| 14.3225 | 1 | 0.6% |
| 18.3763 | 1 | 0.6% |
Eggs
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.95439 |
|---|---|
| Minimum | 0.058 |
| Maximum | 3.2756 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.058 |
|---|---|
| 5-th percentile | 0.13722 |
| Q1 | 0.3682 |
| Median | 0.8991 |
| Q3 | 1.2664 |
| 95-th percentile | 2.2859 |
| Maximum | 3.2756 |
| Range | 3.2176 |
| Interquartile range | 0.8982 |
Descriptive statistics
| Standard deviation | 0.65851 |
|---|---|
| Coef of variation | 0.68998 |
| Kurtosis | 1.2164 |
| Mean | 0.95439 |
| MAD | 0.51167 |
| Skewness | 1.0108 |
| Sum | 149.84 |
| Variance | 0.43364 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.8991 | 2 | 1.3% |
| 0.8448 | 1 | 0.6% |
| 0.32899999999999996 | 1 | 0.6% |
| 1.3259 | 1 | 0.6% |
| 2.4481 | 1 | 0.6% |
| 0.825 | 1 | 0.6% |
| 1.7484 | 1 | 0.6% |
| 0.5249 | 1 | 0.6% |
| 0.6203 | 1 | 0.6% |
| 1.5706 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.057999999999999996 | 1 | 0.6% |
| 0.0701 | 1 | 0.6% |
| 0.0744 | 1 | 0.6% |
| 0.0746 | 1 | 0.6% |
| 0.1074 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.6186 | 1 | 0.6% |
| 2.7596 | 1 | 0.6% |
| 2.8961 | 1 | 0.6% |
| 3.1241 | 1 | 0.6% |
| 3.2756 | 1 | 0.6% |
Fish, Seafood
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.84838 |
|---|---|
| Minimum | 0.0174 |
| Maximum | 8.4068 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0174 |
|---|---|
| 5-th percentile | 0.09486 |
| Q1 | 0.3245 |
| Median | 0.5708 |
| Q3 | 1.0457 |
| 95-th percentile | 2.3237 |
| Maximum | 8.4068 |
| Range | 8.3894 |
| Interquartile range | 0.7212 |
Descriptive statistics
| Standard deviation | 0.94856 |
|---|---|
| Coef of variation | 1.1181 |
| Kurtosis | 26.937 |
| Mean | 0.84838 |
| MAD | 0.58949 |
| Skewness | 4.1429 |
| Sum | 133.19 |
| Variance | 0.89977 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4515 | 1 | 0.6% |
| 0.5746 | 1 | 0.6% |
| 0.1302 | 1 | 0.6% |
| 3.2666 | 1 | 0.6% |
| 0.5633 | 1 | 0.6% |
| 0.4514 | 1 | 0.6% |
| 0.7451 | 1 | 0.6% |
| 0.0962 | 1 | 0.6% |
| 1.169 | 1 | 0.6% |
| 0.1482 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0174 | 1 | 0.6% |
| 0.0315 | 1 | 0.6% |
| 0.0327 | 1 | 0.6% |
| 0.0559 | 1 | 0.6% |
| 0.0587 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.7774 | 1 | 0.6% |
| 3.0833 | 1 | 0.6% |
| 3.2666 | 1 | 0.6% |
| 4.8461 | 1 | 0.6% |
| 8.4068 | 1 | 0.6% |
Fruits - Excluding Wine
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.54967 |
|---|---|
| Minimum | 0.0373 |
| Maximum | 9.6727 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0373 |
|---|---|
| 5-th percentile | 0.08512 |
| Q1 | 0.2388 |
| Median | 0.3556 |
| Q3 | 0.5791 |
| 95-th percentile | 1.3878 |
| Maximum | 9.6727 |
| Range | 9.6354 |
| Interquartile range | 0.3403 |
Descriptive statistics
| Standard deviation | 0.86907 |
|---|---|
| Coef of variation | 1.5811 |
| Kurtosis | 79.295 |
| Mean | 0.54967 |
| MAD | 0.37635 |
| Skewness | 7.968 |
| Sum | 86.298 |
| Variance | 0.75529 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.2987 | 2 | 1.3% |
| 1.5030000000000001 | 1 | 0.6% |
| 1.2516 | 1 | 0.6% |
| 0.0614 | 1 | 0.6% |
| 0.7014 | 1 | 0.6% |
| 1.2151 | 1 | 0.6% |
| 0.5791 | 1 | 0.6% |
| 0.1339 | 1 | 0.6% |
| 0.2938 | 1 | 0.6% |
| 0.3433 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0373 | 1 | 0.6% |
| 0.042 | 1 | 0.6% |
| 0.0443 | 1 | 0.6% |
| 0.0614 | 1 | 0.6% |
| 0.0624 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.6382 | 1 | 0.6% |
| 1.6804 | 1 | 0.6% |
| 2.8436 | 1 | 0.6% |
| 3.6133 | 1 | 0.6% |
| 9.6727 | 1 | 0.6% |
Meat
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 9.2145 |
|---|---|
| Minimum | 0.9061 |
| Maximum | 22.878 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.9061 |
|---|---|
| 5-th percentile | 2.9911 |
| Q1 | 6.1244 |
| Median | 9.0171 |
| Q3 | 11.558 |
| 95-th percentile | 17.65 |
| Maximum | 22.878 |
| Range | 21.972 |
| Interquartile range | 5.4337 |
Descriptive statistics
| Standard deviation | 4.4521 |
|---|---|
| Coef of variation | 0.48317 |
| Kurtosis | 0.31683 |
| Mean | 9.2145 |
| MAD | 3.4562 |
| Skewness | 0.62422 |
| Sum | 1446.7 |
| Variance | 19.822 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 3.3685 | 1 | 0.6% |
| 11.5581 | 1 | 0.6% |
| 9.4166 | 1 | 0.6% |
| 8.6212 | 1 | 0.6% |
| 9.7764 | 1 | 0.6% |
| 11.5636 | 1 | 0.6% |
| 2.4838 | 1 | 0.6% |
| 9.6514 | 1 | 0.6% |
| 6.7594 | 1 | 0.6% |
| 2.8993 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9061 | 1 | 0.6% |
| 1.3488 | 1 | 0.6% |
| 1.8407 | 1 | 0.6% |
| 2.0269999999999997 | 1 | 0.6% |
| 2.0412 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 19.2693 | 1 | 0.6% |
| 20.2172 | 1 | 0.6% |
| 21.0223 | 1 | 0.6% |
| 21.6062 | 1 | 0.6% |
| 22.8778 | 1 | 0.6% |
Milk - Excluding Butter
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 5.2451 |
|---|---|
| Minimum | 0.1779 |
| Maximum | 17.758 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.1779 |
|---|---|
| 5-th percentile | 0.56352 |
| Q1 | 2.2937 |
| Median | 5.1279 |
| Q3 | 7.4411 |
| 95-th percentile | 11.056 |
| Maximum | 17.758 |
| Range | 17.58 |
| Interquartile range | 5.1474 |
Descriptive statistics
| Standard deviation | 3.3636 |
|---|---|
| Coef of variation | 0.64127 |
| Kurtosis | 0.46226 |
| Mean | 5.2451 |
| MAD | 2.6747 |
| Skewness | 0.65741 |
| Sum | 823.49 |
| Variance | 11.314 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 5.86 | 2 | 1.3% |
| 5.9146 | 1 | 0.6% |
| 5.2308 | 1 | 0.6% |
| 7.4043 | 1 | 0.6% |
| 10.3934 | 1 | 0.6% |
| 11.5125 | 1 | 0.6% |
| 2.2937 | 1 | 0.6% |
| 4.8443 | 1 | 0.6% |
| 4.548 | 1 | 0.6% |
| 8.3355 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1779 | 1 | 0.6% |
| 0.2243 | 1 | 0.6% |
| 0.2438 | 1 | 0.6% |
| 0.2553 | 1 | 0.6% |
| 0.4024 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.8155 | 1 | 0.6% |
| 12.5363 | 1 | 0.6% |
| 14.2068 | 1 | 0.6% |
| 14.275 | 1 | 0.6% |
| 17.7576 | 1 | 0.6% |
Offals
Numeric
| Distinct count | 154 |
|---|---|
| Unique (%) | 98.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.14842 |
|---|---|
| Minimum | 0 |
| Maximum | 0.7268 |
| Zeros (%) | 1.3% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.01812 |
| Q1 | 0.0763 |
| Median | 0.1221 |
| Q3 | 0.189 |
| 95-th percentile | 0.33644 |
| Maximum | 0.7268 |
| Range | 0.7268 |
| Interquartile range | 0.1127 |
Descriptive statistics
| Standard deviation | 0.11534 |
|---|---|
| Coef of variation | 0.77713 |
| Kurtosis | 6.8648 |
| Mean | 0.14842 |
| MAD | 0.081521 |
| Skewness | 2.1107 |
| Sum | 23.302 |
| Variance | 0.013304 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1563 | 2 | 1.3% |
| 0.0 | 2 | 1.3% |
| 0.09699999999999999 | 2 | 1.3% |
| 0.2145 | 1 | 0.6% |
| 0.1 | 1 | 0.6% |
| 0.1178 | 1 | 0.6% |
| 0.055 | 1 | 0.6% |
| 0.2175 | 1 | 0.6% |
| 0.0288 | 1 | 0.6% |
| 0.1627 | 1 | 0.6% |
| Other values (144) | 144 | 91.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2 | 1.3% |
| 0.0033 | 1 | 0.6% |
| 0.0065 | 1 | 0.6% |
| 0.0074 | 1 | 0.6% |
| 0.0078 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4322 | 1 | 0.6% |
| 0.4784 | 1 | 0.6% |
| 0.5957 | 1 | 0.6% |
| 0.6717 | 1 | 0.6% |
| 0.7268 | 1 | 0.6% |
Oilcrops
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 3.3572 |
|---|---|
| Minimum | 0.064 |
| Maximum | 28.564 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.064 |
|---|---|
| 5-th percentile | 0.35396 |
| Q1 | 0.7686 |
| Median | 1.566 |
| Q3 | 3.4389 |
| 95-th percentile | 12.664 |
| Maximum | 28.564 |
| Range | 28.5 |
| Interquartile range | 2.6703 |
Descriptive statistics
| Standard deviation | 4.8372 |
|---|---|
| Coef of variation | 1.4408 |
| Kurtosis | 10.419 |
| Mean | 3.3572 |
| MAD | 3.0245 |
| Skewness | 3.0302 |
| Sum | 527.08 |
| Variance | 23.398 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.8931 | 1 | 0.6% |
| 1.9462 | 1 | 0.6% |
| 3.3157 | 1 | 0.6% |
| 1.4978 | 1 | 0.6% |
| 13.4216 | 1 | 0.6% |
| 3.0787 | 1 | 0.6% |
| 3.1993 | 1 | 0.6% |
| 3.6545 | 1 | 0.6% |
| 0.8909 | 1 | 0.6% |
| 0.6488 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.064 | 1 | 0.6% |
| 0.0895 | 1 | 0.6% |
| 0.1003 | 1 | 0.6% |
| 0.1007 | 1 | 0.6% |
| 0.1259 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 16.8666 | 1 | 0.6% |
| 20.7704 | 1 | 0.6% |
| 23.1779 | 1 | 0.6% |
| 27.1892 | 1 | 0.6% |
| 28.5639 | 1 | 0.6% |
Pulses
Numeric
| Distinct count | 148 |
|---|---|
| Unique (%) | 94.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.27069 |
|---|---|
| Minimum | 0 |
| Maximum | 2.6909 |
| Zeros (%) | 4.5% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.00868 |
| Q1 | 0.0427 |
| Median | 0.1483 |
| Q3 | 0.3794 |
| 95-th percentile | 0.85764 |
| Maximum | 2.6909 |
| Range | 2.6909 |
| Interquartile range | 0.3367 |
Descriptive statistics
| Standard deviation | 0.37877 |
|---|---|
| Coef of variation | 1.3993 |
| Kurtosis | 18.259 |
| Mean | 0.27069 |
| MAD | 0.24108 |
| Skewness | 3.6127 |
| Sum | 42.498 |
| Variance | 0.14347 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 7 | 4.5% |
| 0.0658 | 2 | 1.3% |
| 0.0353 | 2 | 1.3% |
| 0.012 | 2 | 1.3% |
| 0.1365 | 1 | 0.6% |
| 0.2034 | 1 | 0.6% |
| 0.0427 | 1 | 0.6% |
| 0.0152 | 1 | 0.6% |
| 0.2175 | 1 | 0.6% |
| 0.2743 | 1 | 0.6% |
| Other values (138) | 138 | 87.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 7 | 4.5% |
| 0.0078 | 1 | 0.6% |
| 0.0089 | 1 | 0.6% |
| 0.0101 | 1 | 0.6% |
| 0.0105 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0624 | 1 | 0.6% |
| 1.1084 | 1 | 0.6% |
| 1.4398 | 1 | 0.6% |
| 2.5545 | 1 | 0.6% |
| 2.6909 | 1 | 0.6% |
Spices
Numeric
| Distinct count | 143 |
|---|---|
| Unique (%) | 91.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.28085 |
|---|---|
| Minimum | 0 |
| Maximum | 2.6851 |
| Zeros (%) | 6.4% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.0377 |
| Median | 0.0994 |
| Q3 | 0.3181 |
| 95-th percentile | 1.1925 |
| Maximum | 2.6851 |
| Range | 2.6851 |
| Interquartile range | 0.2804 |
Descriptive statistics
| Standard deviation | 0.46074 |
|---|---|
| Coef of variation | 1.6406 |
| Kurtosis | 10.247 |
| Mean | 0.28085 |
| MAD | 0.29841 |
| Skewness | 2.9727 |
| Sum | 44.093 |
| Variance | 0.21228 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 10 | 6.4% |
| 0.0836 | 2 | 1.3% |
| 0.1235 | 2 | 1.3% |
| 0.043 | 2 | 1.3% |
| 0.0336 | 2 | 1.3% |
| 0.0103 | 2 | 1.3% |
| 0.1697 | 1 | 0.6% |
| 0.0414 | 1 | 0.6% |
| 0.0319 | 1 | 0.6% |
| 0.0285 | 1 | 0.6% |
| Other values (133) | 133 | 84.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 10 | 6.4% |
| 0.0052 | 1 | 0.6% |
| 0.0058 | 1 | 0.6% |
| 0.0079 | 1 | 0.6% |
| 0.008 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.4302 | 1 | 0.6% |
| 1.7594 | 1 | 0.6% |
| 2.2196 | 1 | 0.6% |
| 2.597 | 1 | 0.6% |
| 2.6851 | 1 | 0.6% |
Starchy Roots
Numeric
| Distinct count | 154 |
|---|---|
| Unique (%) | 98.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.2263 |
|---|---|
| Minimum | 0.0124 |
| Maximum | 2.1778 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0124 |
|---|---|
| 5-th percentile | 0.02806 |
| Q1 | 0.0481 |
| Median | 0.0877 |
| Q3 | 0.1985 |
| 95-th percentile | 0.94442 |
| Maximum | 2.1778 |
| Range | 2.1654 |
| Interquartile range | 0.1504 |
Descriptive statistics
| Standard deviation | 0.36724 |
|---|---|
| Coef of variation | 1.6228 |
| Kurtosis | 13.003 |
| Mean | 0.2263 |
| MAD | 0.22338 |
| Skewness | 3.3642 |
| Sum | 35.528 |
| Variance | 0.13486 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0513 | 2 | 1.3% |
| 0.0329 | 2 | 1.3% |
| 0.0565 | 2 | 1.3% |
| 1.0806 | 1 | 0.6% |
| 0.1103 | 1 | 0.6% |
| 1.0609 | 1 | 0.6% |
| 0.0697 | 1 | 0.6% |
| 0.8018 | 1 | 0.6% |
| 0.0965 | 1 | 0.6% |
| 0.0516 | 1 | 0.6% |
| Other values (144) | 144 | 91.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0124 | 1 | 0.6% |
| 0.0168 | 1 | 0.6% |
| 0.0207 | 1 | 0.6% |
| 0.0217 | 1 | 0.6% |
| 0.0247 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.2621 | 1 | 0.6% |
| 1.3555 | 1 | 0.6% |
| 2.0087 | 1 | 0.6% |
| 2.1636 | 1 | 0.6% |
| 2.1778 | 1 | 0.6% |
Stimulants
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.62499 |
|---|---|
| Minimum | 0 |
| Maximum | 3.3838 |
| Zeros (%) | 1.3% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.02604 |
| Q1 | 0.1128 |
| Median | 0.3788 |
| Q3 | 0.8506 |
| 95-th percentile | 2.2073 |
| Maximum | 3.3838 |
| Range | 3.3838 |
| Interquartile range | 0.7378 |
Descriptive statistics
| Standard deviation | 0.69622 |
|---|---|
| Coef of variation | 1.114 |
| Kurtosis | 2.6621 |
| Mean | 0.62499 |
| MAD | 0.52418 |
| Skewness | 1.6985 |
| Sum | 98.123 |
| Variance | 0.48473 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2 | 1.3% |
| 1.6874 | 1 | 0.6% |
| 0.1213 | 1 | 0.6% |
| 1.5137 | 1 | 0.6% |
| 0.4161 | 1 | 0.6% |
| 2.0044 | 1 | 0.6% |
| 1.7184 | 1 | 0.6% |
| 0.8266 | 1 | 0.6% |
| 0.0671 | 1 | 0.6% |
| 0.7926 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2 | 1.3% |
| 0.0176 | 1 | 0.6% |
| 0.0186 | 1 | 0.6% |
| 0.0193 | 1 | 0.6% |
| 0.0204 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.6726 | 1 | 0.6% |
| 2.6783 | 1 | 0.6% |
| 2.7774 | 1 | 0.6% |
| 2.7855 | 1 | 0.6% |
| 3.3838 | 1 | 0.6% |
Treenuts
Numeric
| Distinct count | 149 |
|---|---|
| Unique (%) | 94.9% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.69474 |
|---|---|
| Minimum | 0 |
| Maximum | 4.9756 |
| Zeros (%) | 5.1% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.00552 |
| Q1 | 0.1366 |
| Median | 0.4339 |
| Q3 | 0.9018 |
| 95-th percentile | 2.0332 |
| Maximum | 4.9756 |
| Range | 4.9756 |
| Interquartile range | 0.7652 |
Descriptive statistics
| Standard deviation | 0.83028 |
|---|---|
| Coef of variation | 1.1951 |
| Kurtosis | 9.1961 |
| Mean | 0.69474 |
| MAD | 0.56087 |
| Skewness | 2.6207 |
| Sum | 109.07 |
| Variance | 0.68937 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 8 | 5.1% |
| 1.3357 | 2 | 1.3% |
| 0.3628 | 1 | 0.6% |
| 0.3404 | 1 | 0.6% |
| 0.7754 | 1 | 0.6% |
| 1.181 | 1 | 0.6% |
| 1.3194 | 1 | 0.6% |
| 0.9355 | 1 | 0.6% |
| 0.9704 | 1 | 0.6% |
| 0.853 | 1 | 0.6% |
| Other values (139) | 139 | 88.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 8 | 5.1% |
| 0.0069 | 1 | 0.6% |
| 0.0073 | 1 | 0.6% |
| 0.0111 | 1 | 0.6% |
| 0.0112 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.911 | 1 | 0.6% |
| 3.3116 | 1 | 0.6% |
| 3.8246 | 1 | 0.6% |
| 4.9044 | 1 | 0.6% |
| 4.9756 | 1 | 0.6% |
Vegetal Products
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 29.405 |
|---|---|
| Minimum | 13.098 |
| Maximum | 44.982 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 13.098 |
|---|---|
| 5-th percentile | 17.166 |
| Q1 | 23.09 |
| Median | 29.075 |
| Q3 | 35.586 |
| 95-th percentile | 42.95 |
| Maximum | 44.982 |
| Range | 31.884 |
| Interquartile range | 12.496 |
Descriptive statistics
| Standard deviation | 8.0636 |
|---|---|
| Coef of variation | 0.27423 |
| Kurtosis | -0.92577 |
| Mean | 29.405 |
| MAD | 6.8099 |
| Skewness | 0.066471 |
| Sum | 4616.5 |
| Variance | 65.021 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 33.2667 | 1 | 0.6% |
| 42.7706 | 1 | 0.6% |
| 31.4598 | 1 | 0.6% |
| 30.9068 | 1 | 0.6% |
| 24.1576 | 1 | 0.6% |
| 26.9158 | 1 | 0.6% |
| 23.2622 | 1 | 0.6% |
| 43.8301 | 1 | 0.6% |
| 28.3194 | 1 | 0.6% |
| 35.5857 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13.0982 | 1 | 0.6% |
| 13.2732 | 1 | 0.6% |
| 14.585 | 1 | 0.6% |
| 15.5628 | 1 | 0.6% |
| 15.8736 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 43.9253 | 1 | 0.6% |
| 43.9582 | 1 | 0.6% |
| 44.0022 | 1 | 0.6% |
| 44.6892 | 1 | 0.6% |
| 44.9818 | 1 | 0.6% |
Vegetable Oils
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 18.613 |
|---|---|
| Minimum | 4.9549 |
| Maximum | 36.419 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 4.9549 |
|---|---|
| 5-th percentile | 8.3122 |
| Q1 | 13.868 |
| Median | 18.173 |
| Q3 | 23.554 |
| 95-th percentile | 30.025 |
| Maximum | 36.419 |
| Range | 31.464 |
| Interquartile range | 9.6857 |
Descriptive statistics
| Standard deviation | 6.7794 |
|---|---|
| Coef of variation | 0.36423 |
| Kurtosis | -0.60667 |
| Mean | 18.613 |
| MAD | 5.5039 |
| Skewness | 0.17244 |
| Sum | 2922.2 |
| Variance | 45.96 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 29.9945 | 1 | 0.6% |
| 18.4369 | 1 | 0.6% |
| 17.3147 | 1 | 0.6% |
| 21.215 | 1 | 0.6% |
| 14.4436 | 1 | 0.6% |
| 27.4593 | 1 | 0.6% |
| 14.1945 | 1 | 0.6% |
| 18.8819 | 1 | 0.6% |
| 18.8603 | 1 | 0.6% |
| 11.9281 | 1 | 0.6% |
| Other values (147) | 147 | 93.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.9549 | 1 | 0.6% |
| 6.4849 | 1 | 0.6% |
| 6.7 | 1 | 0.6% |
| 7.1538 | 1 | 0.6% |
| 7.2939 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 30.643 | 1 | 0.6% |
| 31.449 | 1 | 0.6% |
| 33.0391 | 1 | 0.6% |
| 34.0479 | 1 | 0.6% |
| 36.4186 | 1 | 0.6% |
Vegetables
Numeric
| Distinct count | 156 |
|---|---|
| Unique (%) | 99.4% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.30309 |
|---|---|
| Minimum | 0.0263 |
| Maximum | 1.1538 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.0263 |
|---|---|
| 5-th percentile | 0.0793 |
| Q1 | 0.1729 |
| Median | 0.248 |
| Q3 | 0.3612 |
| 95-th percentile | 0.72558 |
| Maximum | 1.1538 |
| Range | 1.1275 |
| Interquartile range | 0.1883 |
Descriptive statistics
| Standard deviation | 0.20382 |
|---|---|
| Coef of variation | 0.67246 |
| Kurtosis | 3.3816 |
| Mean | 0.30309 |
| MAD | 0.14944 |
| Skewness | 1.7208 |
| Sum | 47.585 |
| Variance | 0.041542 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1567 | 2 | 1.3% |
| 0.3066 | 1 | 0.6% |
| 0.2938 | 1 | 0.6% |
| 0.9395 | 1 | 0.6% |
| 0.2694 | 1 | 0.6% |
| 0.0665 | 1 | 0.6% |
| 0.1896 | 1 | 0.6% |
| 0.4514 | 1 | 0.6% |
| 0.2019 | 1 | 0.6% |
| 0.1305 | 1 | 0.6% |
| Other values (146) | 146 | 93.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0263 | 1 | 0.6% |
| 0.0431 | 1 | 0.6% |
| 0.0665 | 1 | 0.6% |
| 0.0738 | 1 | 0.6% |
| 0.0746 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.8535 | 1 | 0.6% |
| 0.8717 | 1 | 0.6% |
| 0.9395 | 1 | 0.6% |
| 1.1118 | 1 | 0.6% |
| 1.1538 | 1 | 0.6% |
Obesity
Numeric
| Distinct count | 113 |
|---|---|
| Unique (%) | 72.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 18.88 |
|---|---|
| Minimum | 2.9 |
| Maximum | 45.6 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 2.9 |
|---|---|
| 5-th percentile | 4.48 |
| Q1 | 8.6 |
| Median | 21.3 |
| Q3 | 25.7 |
| 95-th percentile | 32.02 |
| Maximum | 45.6 |
| Range | 42.7 |
| Interquartile range | 17.1 |
Descriptive statistics
| Standard deviation | 9.6162 |
|---|---|
| Coef of variation | 0.50933 |
| Kurtosis | -0.69653 |
| Mean | 18.88 |
| MAD | 8.1745 |
| Skewness | -0.018764 |
| Sum | 2964.2 |
| Variance | 92.472 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 4.5 | 4 | 2.5% |
| 25.7 | 4 | 2.5% |
| 27.4 | 3 | 1.9% |
| 7.1 | 3 | 1.9% |
| 6.0 | 3 | 1.9% |
| 19.4 | 2 | 1.3% |
| 8.2 | 2 | 1.3% |
| 22.3 | 2 | 1.3% |
| 23.8 | 2 | 1.3% |
| 26.6 | 2 | 1.3% |
| Other values (103) | 130 | 82.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.9 | 1 | 0.6% |
| 3.4 | 1 | 0.6% |
| 3.5 | 1 | 0.6% |
| 3.6 | 1 | 0.6% |
| 3.8 | 2 | 1.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 35.0 | 1 | 0.6% |
| 37.0 | 1 | 0.6% |
| 37.3 | 1 | 0.6% |
| 45.5 | 1 | 0.6% |
| 45.6 | 1 | 0.6% |
Confirmed
Numeric
| Distinct count | 152 |
|---|---|
| Unique (%) | 96.8% |
| Missing (%) | 3.8% |
| Missing (n) | 6 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.085396 |
|---|---|
| Minimum | 4.7059e-05 |
| Maximum | 0.64048 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 4.7059e-05 |
|---|---|
| 5-th percentile | 0.00061365 |
| Q1 | 0.0056523 |
| Median | 0.023208 |
| Q3 | 0.12528 |
| 95-th percentile | 0.36721 |
| Maximum | 0.64048 |
| Range | 0.64044 |
| Interquartile range | 0.11963 |
Descriptive statistics
| Standard deviation | 0.12869 |
|---|---|
| Coef of variation | 1.507 |
| Kurtosis | 3.9173 |
| Mean | 0.085396 |
| MAD | 0.095922 |
| Skewness | 2.0409 |
| Sum | 12.895 |
| Variance | 0.016561 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.020043836232332 | 1 | 0.6% |
| 0.0140169194865811 | 1 | 0.6% |
| 0.0514036458333333 | 1 | 0.6% |
| 0.11669987321137501 | 1 | 0.6% |
| 0.0144264955943732 | 1 | 0.6% |
| 0.0045710669840600205 | 1 | 0.6% |
| 0.17795408507764998 | 1 | 0.6% |
| 0.34251610858772596 | 1 | 0.6% |
| 0.0703065134099617 | 1 | 0.6% |
| 0.184714370452867 | 1 | 0.6% |
| Other values (141) | 141 | 89.8% |
| (Missing) | 6 | 3.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.70588235294118e-05 | 1 | 0.6% |
| 0.000165462818595475 | 1 | 0.6% |
| 0.000266292922214436 | 1 | 0.6% |
| 0.00032775691362239704 | 1 | 0.6% |
| 0.000347076615601495 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.48859312270902394 | 1 | 0.6% |
| 0.49230613484510993 | 1 | 0.6% |
| 0.494030548297325 | 1 | 0.6% |
| 0.499445983379501 | 1 | 0.6% |
| 0.6404838709677421 | 1 | 0.6% |
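The Confirmed summary reports 6 missing values out of 157 rows, which raises the question of whether the mean is taken over all rows or only over the rows with data. A quick arithmetic check on the reported figures suggests the latter. This is a sketch using only numbers copied from the tables above; the variable names are illustrative.

```python
# Figures copied from the Confirmed summary.
n_rows = 157
n_missing = 6
reported_sum = 12.895
reported_mean = 0.085396

# Two candidate definitions of the mean.
mean_over_present = reported_sum / (n_rows - n_missing)  # skip missing rows
mean_over_all = reported_sum / n_rows                    # count missing rows in the denominator

print(f"mean over 151 present rows: {mean_over_present:.6f}")
print(f"mean over all 157 rows:     {mean_over_all:.6f}")
print(f"reported mean:              {reported_mean:.6f}")
```

The first candidate agrees with the reported mean to within rounding, which is consistent with missing entries being skipped rather than treated as zero. The same reasoning should apply to Deaths, Recovered, and Active, which have the same 6 missing rows.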
Deaths
Numeric
| Distinct count | 139 |
|---|---|
| Unique (%) | 88.5% |
| Missing (%) | 3.8% |
| Missing (n) | 6 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.004342 |
|---|---|
| Minimum | 0 |
| Maximum | 0.079857 |
| Zeros (%) | 8.9% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.00015992 |
| Median | 0.00055147 |
| Q3 | 0.0028374 |
| 95-th percentile | 0.02523 |
| Maximum | 0.079857 |
| Range | 0.079857 |
| Interquartile range | 0.0026775 |
Descriptive statistics
| Standard deviation | 0.011129 |
|---|---|
| Coef of variation | 2.563 |
| Kurtosis | 20.82 |
| Mean | 0.004342 |
| MAD | 0.005737 |
| Skewness | 4.2922 |
| Sum | 0.65564 |
| Variance | 0.00012385 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 14 | 8.9% |
| 0.0038834951456310704 | 1 | 0.6% |
| 1.61039239894788e-05 | 1 | 0.6% |
| 0.0122991527899503 | 1 | 0.6% |
| 0.000435172148982465 | 1 | 0.6% |
| 0.00134167519090325 | 1 | 0.6% |
| 0.00034088018315950097 | 1 | 0.6% |
| 0.000513384671800513 | 1 | 0.6% |
| 0.0592441526989994 | 1 | 0.6% |
| 0.0013599999999999999 | 1 | 0.6% |
| Other values (128) | 128 | 81.5% |
| (Missing) | 6 | 3.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 14 | 8.9% |
| 4.4611390180140805e-06 | 1 | 0.6% |
| 6.990807088678391e-06 | 1 | 0.6% |
| 7.41592198450072e-06 | 1 | 0.6% |
| 9.54593184204665e-06 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0372447987555901 | 1 | 0.6% |
| 0.0433954406638492 | 1 | 0.6% |
| 0.0535752754992129 | 1 | 0.6% |
| 0.0592441526989994 | 1 | 0.6% |
| 0.0798568685634491 | 1 | 0.6% |
Recovered
Numeric
| Distinct count | 152 |
|---|---|
| Unique (%) | 96.8% |
| Missing (%) | 3.8% |
| Missing (n) | 6 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.044633 |
|---|---|
| Minimum | 0 |
| Maximum | 0.60129 |
| Zeros (%) | 0.6% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.00015598 |
| Q1 | 0.0023399 |
| Median | 0.01 |
| Q3 | 0.04947 |
| 95-th percentile | 0.16746 |
| Maximum | 0.60129 |
| Range | 0.60129 |
| Interquartile range | 0.04713 |
Descriptive statistics
| Standard deviation | 0.08753 |
|---|---|
| Coef of variation | 1.9611 |
| Kurtosis | 17.411 |
| Mean | 0.044633 |
| MAD | 0.052744 |
| Skewness | 3.7913 |
| Sum | 6.7396 |
| Variance | 0.0076615 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.18888808664259898 | 1 | 0.6% |
| 0.0141660917762923 | 1 | 0.6% |
| 0.0006961616007912661 | 1 | 0.6% |
| 0.0291976674039815 | 1 | 0.6% |
| 0.025568069551472 | 1 | 0.6% |
| 0.0353780617678381 | 1 | 0.6% |
| 0.000379346680716544 | 1 | 0.6% |
| 0.00276223420490386 | 1 | 0.6% |
| 0.00304002444240758 | 1 | 0.6% |
| 0.13798709552459199 | 1 | 0.6% |
| Other values (141) | 141 | 89.8% |
| (Missing) | 6 | 3.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1 | 0.6% |
| 1.7145600438927402e-05 | 1 | 0.6% |
| 5.4093613771597706e-05 | 1 | 0.6% |
| 0.000108851792039544 | 1 | 0.6% |
| 0.000122908842608399 | 1 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.319452764854588 | 1 | 0.6% |
| 0.324311712552497 | 1 | 0.6% |
| 0.426402105689411 | 1 | 0.6% |
| 0.495567867036011 | 1 | 0.6% |
| 0.601290322580645 | 1 | 0.6% |
Active
Numeric
| Distinct count | 148 |
|---|---|
| Unique (%) | 94.3% |
| Missing (%) | 3.8% |
| Missing (n) | 6 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.036418 |
|---|---|
| Minimum | 0 |
| Maximum | 0.35315 |
| Zeros (%) | 3.2% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7.1749e-05 |
| Q1 | 0.0017281 |
| Median | 0.0087378 |
| Q3 | 0.030314 |
| 95-th percentile | 0.19483 |
| Maximum | 0.35315 |
| Range | 0.35315 |
| Interquartile range | 0.028586 |
Descriptive statistics
| Standard deviation | 0.065191 |
|---|---|
| Coef of variation | 1.7901 |
| Kurtosis | 6.6996 |
| Mean | 0.036418 |
| MAD | 0.0441 |
| Skewness | 2.5637 |
| Sum | 5.4991 |
| Variance | 0.0042498 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) | |
| 0.0 | 5 | 3.2% |
|
| 0.00116597557728155 | 1 | 0.6% |
|
| 0.00023376368454393798 | 1 | 0.6% |
|
| 0.22089247520902203 | 1 | 0.6% |
|
| 0.00798325231123725 | 1 | 0.6% |
|
| 0.000525816739936212 | 1 | 0.6% |
|
| 0.00110803324099723 | 1 | 0.6% |
|
| 0.14995322731524802 | 1 | 0.6% |
|
| 0.0031978126485972397 | 1 | 0.6% |
|
| 8.225860675378931e-06 | 1 | 0.6% |
|
| Other values (137) | 137 | 87.3% |
|
| (Missing) | 6 | 3.8% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 0.0 | 5 | 3.2% |
|
| 8.225860675378931e-06 | 1 | 0.6% |
|
| 4.70588235294118e-05 | 1 | 0.6% |
|
| 7.007708479327261e-05 | 1 | 0.6% |
|
| 7.342143906020559e-05 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 0.22089247520902203 | 1 | 0.6% |
|
| 0.223311220074993 | 1 | 0.6% |
|
| 0.27915866643393294 | 1 | 0.6% |
|
| 0.29852626574756397 | 1 | 0.6% |
|
| 0.35315096626796705 | 1 | 0.6% |
|
Population_x
Numeric
| Distinct count | 157 |
|---|---|
| Unique (%) | 100.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 45795000 |
|---|---|
| Minimum | 97000 |
| Maximum | 1398000000 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 97000 |
|---|---|
| 5-th percentile | 307800 |
| Q1 | 3269000 |
| Median | 10023000 |
| Q3 | 31781000 |
| 95-th percentile | 150120000 |
| Maximum | 1398000000 |
| Range | 1397900000 |
| Interquartile range | 28512000 |
Descriptive statistics
| Standard deviation | 161640000 |
|---|---|
| Coef of variation | 3.5297 |
| Kurtosis | 61.722 |
| Mean | 45795000 |
| MAD | 56917000 |
| Skewness | 7.6198 |
| Sum | 7189800000 |
| Variance | 2.6128e+16 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) | |
| 25305000.0 | 1 | 0.6% |
|
| 43406000.0 | 1 | 0.6% |
|
| 3997000.0 | 1 | 0.6% |
|
| 361000.0 | 1 | 0.6% |
|
| 5819000.0 | 1 | 0.6% |
|
| 11212000.0 | 1 | 0.6% |
|
| 31427000.0 | 1 | 0.6% |
|
| 17861000.0 | 1 | 0.6% |
|
| 29162000.0 | 1 | 0.6% |
|
| 16296000.0 | 1 | 0.6% |
|
| Other values (147) | 147 | 93.6% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 97000.0 | 1 | 0.6% |
|
| 111000.0 | 1 | 0.6% |
|
| 112000.0 | 1 | 0.6% |
|
| 123000.0 | 1 | 0.6% |
|
| 180000.0 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 216565000.0 | 1 | 0.6% |
|
| 268419000.0 | 1 | 0.6% |
|
| 329153000.0 | 1 | 0.6% |
|
| 1391885000.0 | 1 | 0.6% |
|
| 1398030000.0 | 1 | 0.6% |
|
Unit (all except Population)
Constant
This variable is constant and should be ignored for analysis
| Constant value | % |
|---|
Year
Constant
This variable is constant and should be ignored for analysis
| Constant value | 2015 |
|---|
Status
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 1.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Value | Count | Frequency (%) | |
| Developing | 127 | 80.9% |
|
| Developed | 30 | 19.1% |
|
Life expectancy
Numeric
| Distinct count | 118 |
|---|---|
| Unique (%) | 75.2% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 72.01 |
|---|---|
| Minimum | 51 |
| Maximum | 88 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 51 |
|---|---|
| 5-th percentile | 58.08 |
| Q1 | 66.3 |
| Median | 74.1 |
| Q3 | 77 |
| 95-th percentile | 82.72 |
| Maximum | 88 |
| Range | 37 |
| Interquartile range | 10.7 |
Descriptive statistics
| Standard deviation | 7.9091 |
|---|---|
| Coef of variation | 0.10983 |
| Kurtosis | -0.22973 |
| Mean | 72.01 |
| MAD | 6.4375 |
| Skewness | -0.53125 |
| Sum | 11306 |
| Variance | 62.554 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 75.0 | 4 | 2.5% |
|
| 61.8 | 3 | 1.9% |
|
| 76.1 | 3 | 1.9% |
|
| 81.1 | 3 | 1.9% |
|
| 75.5 | 3 | 1.9% |
|
| 74.6 | 3 | 1.9% |
|
| 74.9 | 3 | 1.9% |
|
| 74.8 | 3 | 1.9% |
|
| 65.7 | 3 | 1.9% |
|
| 69.2 | 2 | 1.3% |
|
| Other values (108) | 127 | 80.9% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 51.0 | 1 | 0.6% |
|
| 52.4 | 1 | 0.6% |
|
| 52.5 | 1 | 0.6% |
|
| 53.1 | 1 | 0.6% |
|
| 53.7 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 83.4 | 1 | 0.6% |
|
| 83.7 | 1 | 0.6% |
|
| 85.0 | 2 | 1.3% |
|
| 86.0 | 1 | 0.6% |
|
| 88.0 | 1 | 0.6% |
|
Adult Mortality
Numeric
| Distinct count | 117 |
|---|---|
| Unique (%) | 74.5% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 148.08 |
|---|---|
| Minimum | 1 |
| Maximum | 484 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 74 |
| Median | 137 |
| Q3 | 198 |
| 95-th percentile | 337.6 |
| Maximum | 484 |
| Range | 483 |
| Interquartile range | 124 |
Descriptive statistics
| Standard deviation | 94.675 |
|---|---|
| Coef of variation | 0.63934 |
| Kurtosis | 0.60037 |
| Mean | 148.08 |
| MAD | 74.362 |
| Skewness | 0.85128 |
| Sum | 23249 |
| Variance | 8963.4 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 118.0 | 4 | 2.5% |
|
| 74.0 | 3 | 1.9% |
|
| 19.0 | 3 | 1.9% |
|
| 13.0 | 3 | 1.9% |
|
| 95.0 | 3 | 1.9% |
|
| 249.0 | 3 | 1.9% |
|
| 16.0 | 2 | 1.3% |
|
| 152.0 | 2 | 1.3% |
|
| 222.0 | 2 | 1.3% |
|
| 146.0 | 2 | 1.3% |
|
| Other values (107) | 130 | 82.8% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 1.0 | 1 | 0.6% |
|
| 13.0 | 3 | 1.9% |
|
| 16.0 | 2 | 1.3% |
|
| 17.0 | 1 | 0.6% |
|
| 19.0 | 3 | 1.9% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 357.0 | 1 | 0.6% |
|
| 365.0 | 1 | 0.6% |
|
| 397.0 | 1 | 0.6% |
|
| 413.0 | 1 | 0.6% |
|
| 484.0 | 1 | 0.6% |
|
percentage expenditure
Numeric
| Distinct count | 3 |
|---|---|
| Unique (%) | 1.9% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 2.7787 |
|---|---|
| Minimum | 0 |
| Maximum | 364.98 |
| Zeros (%) | 98.7% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 364.98 |
| Range | 364.98 |
| Interquartile range | 0 |
Descriptive statistics
| Standard deviation | 29.643 |
|---|---|
| Coef of variation | 10.668 |
| Kurtosis | 145.62 |
| Mean | 2.7787 |
| MAD | 5.4866 |
| Skewness | 11.924 |
| Sum | 436.25 |
| Variance | 878.69 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 0.0 | 155 | 98.7% |
|
| 71.27962362 | 1 | 0.6% |
|
| 364.9752287 | 1 | 0.6% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 0.0 | 155 | 98.7% |
|
| 71.27962362 | 1 | 0.6% |
|
| 364.9752287 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 0.0 | 155 | 98.7% |
|
| 71.27962362 | 1 | 0.6% |
|
| 364.9752287 | 1 | 0.6% |
|
BMI
Numeric
| Distinct count | 126 |
|---|---|
| Unique (%) | 80.3% |
| Missing (%) | 0.6% |
| Missing (n) | 1 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 43.297 |
|---|---|
| Minimum | 2.5 |
| Maximum | 77.6 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 2.5 |
|---|---|
| 5-th percentile | 5.825 |
| Q1 | 24.175 |
| Median | 52.6 |
| Q3 | 61.45 |
| 95-th percentile | 66.7 |
| Maximum | 77.6 |
| Range | 75.1 |
| Interquartile range | 37.275 |
Descriptive statistics
| Standard deviation | 20.804 |
|---|---|
| Coef of variation | 0.48049 |
| Kurtosis | -1.2007 |
| Mean | 43.297 |
| MAD | 18.854 |
| Skewness | -0.44637 |
| Sum | 6754.4 |
| Variance | 432.81 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 61.2 | 3 | 1.9% |
|
| 62.1 | 3 | 1.9% |
|
| 66.1 | 3 | 1.9% |
|
| 27.4 | 3 | 1.9% |
|
| 19.1 | 3 | 1.9% |
|
| 25.4 | 3 | 1.9% |
|
| 23.8 | 3 | 1.9% |
|
| 63.7 | 2 | 1.3% |
|
| 24.3 | 2 | 1.3% |
|
| 23.4 | 2 | 1.3% |
|
| Other values (115) | 129 | 82.2% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 2.5 | 1 | 0.6% |
|
| 3.8 | 1 | 0.6% |
|
| 3.9 | 1 | 0.6% |
|
| 4.6 | 1 | 0.6% |
|
| 4.7 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 68.2 | 1 | 0.6% |
|
| 69.6 | 2 | 1.3% |
|
| 71.4 | 1 | 0.6% |
|
| 74.7 | 1 | 0.6% |
|
| 77.6 | 1 | 0.6% |
|
under-five deaths
Numeric
| Distinct count | 50 |
|---|---|
| Unique (%) | 31.8% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 32.815 |
|---|---|
| Minimum | 0 |
| Maximum | 1100 |
| Zeros (%) | 28.7% |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| Median | 3 |
| Q3 | 21 |
| 95-th percentile | 101 |
| Maximum | 1100 |
| Range | 1100 |
| Interquartile range | 21 |
Descriptive statistics
| Standard deviation | 113.69 |
|---|---|
| Coef of variation | 3.4645 |
| Kurtosis | 60.121 |
| Mean | 32.815 |
| MAD | 44.281 |
| Skewness | 7.299 |
| Sum | 5152 |
| Variance | 12925 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 45 | 28.7% |
|
| 1 | 21 | 13.4% |
|
| 3 | 9 | 5.7% |
|
| 2 | 9 | 5.7% |
|
| 12 | 4 | 2.5% |
|
| 4 | 3 | 1.9% |
|
| 10 | 3 | 1.9% |
|
| 21 | 3 | 1.9% |
|
| 5 | 3 | 1.9% |
|
| 11 | 3 | 1.9% |
|
| Other values (40) | 54 | 34.4% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 0 | 45 | 28.7% |
|
| 1 | 21 | 13.4% |
|
| 2 | 9 | 5.7% |
|
| 3 | 9 | 5.7% |
|
| 4 | 3 | 1.9% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 183 | 1 | 0.6% |
|
| 194 | 1 | 0.6% |
|
| 433 | 1 | 0.6% |
|
| 747 | 1 | 0.6% |
|
| 1100 | 1 | 0.6% |
|
Polio
Numeric
| Distinct count | 38 |
|---|---|
| Unique (%) | 24.2% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 83.917 |
|---|---|
| Minimum | 6 |
| Maximum | 99 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 84 |
| Median | 93 |
| Q3 | 97 |
| 95-th percentile | 99 |
| Maximum | 99 |
| Range | 93 |
| Interquartile range | 13 |
Descriptive statistics
| Standard deviation | 23.265 |
|---|---|
| Coef of variation | 0.27723 |
| Kurtosis | 4.7838 |
| Mean | 83.917 |
| MAD | 15.743 |
| Skewness | -2.3297 |
| Sum | 13175 |
| Variance | 541.24 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 99.0 | 27 | 17.2% |
|
| 97.0 | 14 | 8.9% |
|
| 93.0 | 11 | 7.0% |
|
| 95.0 | 10 | 6.4% |
|
| 98.0 | 9 | 5.7% |
|
| 96.0 | 8 | 5.1% |
|
| 88.0 | 7 | 4.5% |
|
| 92.0 | 7 | 4.5% |
|
| 91.0 | 6 | 3.8% |
|
| 89.0 | 6 | 3.8% |
|
| Other values (28) | 52 | 33.1% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 6.0 | 1 | 0.6% |
|
| 7.0 | 1 | 0.6% |
|
| 8.0 | 4 | 2.5% |
|
| 9.0 | 4 | 2.5% |
|
| 42.0 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 95.0 | 10 | 6.4% |
|
| 96.0 | 8 | 5.1% |
|
| 97.0 | 14 | 8.9% |
|
| 98.0 | 9 | 5.7% |
|
| 99.0 | 27 | 17.2% |
|
Diphtheria
Numeric
| Distinct count | 38 |
|---|---|
| Unique (%) | 24.2% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 85.478 |
|---|---|
| Minimum | 6 |
| Maximum | 99 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 41.4 |
| Q1 | 86 |
| Median | 93 |
| Q3 | 97 |
| 95-th percentile | 99 |
| Maximum | 99 |
| Range | 93 |
| Interquartile range | 11 |
Descriptive statistics
| Standard deviation | 21.122 |
|---|---|
| Coef of variation | 0.2471 |
| Kurtosis | 6.4406 |
| Mean | 85.478 |
| MAD | 13.944 |
| Skewness | -2.5729 |
| Sum | 13420 |
| Variance | 446.12 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 99.0 | 23 | 14.6% |
|
| 95.0 | 14 | 8.9% |
|
| 97.0 | 14 | 8.9% |
|
| 98.0 | 13 | 8.3% |
|
| 93.0 | 11 | 7.0% |
|
| 89.0 | 8 | 5.1% |
|
| 96.0 | 8 | 5.1% |
|
| 91.0 | 7 | 4.5% |
|
| 87.0 | 6 | 3.8% |
|
| 92.0 | 5 | 3.2% |
|
| Other values (28) | 48 | 30.6% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 6.0 | 2 | 1.3% |
|
| 8.0 | 3 | 1.9% |
|
| 9.0 | 2 | 1.3% |
|
| 23.0 | 1 | 0.6% |
|
| 46.0 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 95.0 | 14 | 8.9% |
|
| 96.0 | 8 | 5.1% |
|
| 97.0 | 14 | 8.9% |
|
| 98.0 | 13 | 8.3% |
|
| 99.0 | 23 | 14.6% |
|
HIV/AIDS
Numeric
| Distinct count | 28 |
|---|---|
| Unique (%) | 17.8% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 0.6242 |
|---|---|
| Minimum | 0.1 |
| Maximum | 9.3 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 0.1 |
| Q1 | 0.1 |
| Median | 0.1 |
| Q3 | 0.3 |
| 95-th percentile | 3.52 |
| Maximum | 9.3 |
| Range | 9.2 |
| Interquartile range | 0.2 |
Descriptive statistics
| Standard deviation | 1.293 |
|---|---|
| Coef of variation | 2.0715 |
| Kurtosis | 15.607 |
| Mean | 0.6242 |
| MAD | 0.78276 |
| Skewness | 3.5465 |
| Sum | 98 |
| Variance | 1.672 |
| Memory size | 2.5 KiB |
| Value | Count | Frequency (%) | |
| 0.1 | 100 | 63.7% |
|
| 0.2 | 11 | 7.0% |
|
| 0.3 | 9 | 5.7% |
|
| 0.4 | 4 | 2.5% |
|
| 0.5 | 4 | 2.5% |
|
| 2.8 | 3 | 1.9% |
|
| 1.0 | 2 | 1.3% |
|
| 0.6 | 2 | 1.3% |
|
| 2.1 | 2 | 1.3% |
|
| 0.9 | 2 | 1.3% |
|
| Other values (18) | 18 | 11.5% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 0.1 | 100 | 63.7% |
|
| 0.2 | 11 | 7.0% |
|
| 0.3 | 9 | 5.7% |
|
| 0.4 | 4 | 2.5% |
|
| 0.5 | 4 | 2.5% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 4.1 | 1 | 0.6% |
|
| 4.4 | 1 | 0.6% |
|
| 4.8 | 1 | 0.6% |
|
| 6.2 | 1 | 0.6% |
|
| 9.3 | 1 | 0.6% |
|
Population_y
Numeric
| Distinct count | 133 |
|---|---|
| Unique (%) | 84.7% |
| Missing (%) | 15.9% |
| Missing (n) | 25 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Mean | 11623000 |
|---|---|
| Minimum | 2966 |
| Maximum | 258160000 |
| Zeros (%) | 0.0% |
Quantile statistics
| Minimum | 2966 |
|---|---|
| 5-th percentile | 31869 |
| Q1 | 289190 |
| Median | 2424500 |
| Q3 | 10309000 |
| 95-th percentile | 44781000 |
| Maximum | 258160000 |
| Range | 258160000 |
| Interquartile range | 10020000 |
Descriptive statistics
| Standard deviation | 29933000 |
|---|---|
| Coef of variation | 2.5754 |
| Kurtosis | 42.463 |
| Mean | 11623000 |
| MAD | 14118000 |
| Skewness | 5.9702 |
| Sum | 1534200000 |
| Variance | 895970000000000 |
| Memory size | 7.5 KiB |
| Value | Count | Frequency (%) | |
| 487852.0 | 1 | 0.6% |
|
| 896829.0 | 1 | 0.6% |
|
| 11629553.0 | 1 | 0.6% |
|
| 8381.0 | 1 | 0.6% |
|
| 126265.0 | 1 | 0.6% |
|
| 6312478.0 | 1 | 0.6% |
|
| 49163.0 | 1 | 0.6% |
|
| 56964.0 | 1 | 0.6% |
|
| 17762681.0 | 1 | 0.6% |
|
| 622159.0 | 1 | 0.6% |
|
| Other values (122) | 122 | 77.7% |
|
| (Missing) | 25 | 15.9% |
|
Minimum 5 values
| Value | Count | Frequency (%) | |
| 2966.0 | 1 | 0.6% |
|
| 8381.0 | 1 | 0.6% |
|
| 11247.0 | 1 | 0.6% |
|
| 13692.0 | 1 | 0.6% |
|
| 26463.0 | 1 | 0.6% |
|
Maximum 5 values
| Value | Count | Frequency (%) | |
| 48228697.0 | 1 | 0.6% |
|
| 78271472.0 | 1 | 0.6% |
|
| 81686611.0 | 1 | 0.6% |
|
| 181181744.0 | 1 | 0.6% |
|
| 258162113.0 | 1 | 0.6% |
|
Income composition of resources
Highly correlated
This variable is highly correlated with Life expectancy and should be ignored for analysis
| Correlation | 0.90758 |
|---|
Schooling
Highly correlated
This variable is highly correlated with Income composition of resources and should be ignored for analysis
| Correlation | 0.92639 |
|---|
| | Country | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | Oilcrops | Pulses | Spices | Starchy Roots | Stimulants | Treenuts | Vegetal Products | Vegetable Oils | Vegetables | Obesity | Confirmed | Deaths | Recovered | Active | Population_x | Unit (all except Population) | Year | Status | Life expectancy | Adult Mortality | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population_y | Income composition of resources | Schooling |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 21.6397 | 6.2224 | 8.0353 | 0.6859 | 0.0327 | 0.4246 | 6.1244 | 8.2803 | 0.3103 | 1.0452 | 0.1960 | 0.2776 | 0.0490 | 0.0980 | 0.7513 | 28.3684 | 17.0831 | 0.3593 | 4.5 | 0.021411 | 0.000492 | 0.002445 | 0.018474 | 38042000.0 | % | 2015 | Developing | 65.0 | 263.0 | 71.279624 | 19.1 | 83 | 6.0 | 65.0 | 0.1 | 33736494.0 | 0.479 | 10.1 |
| 1 | Albania | 32.0002 | 3.4172 | 2.6734 | 1.6448 | 0.1445 | 0.6418 | 8.7428 | 17.7576 | 0.2933 | 3.1622 | 0.1148 | 0.0000 | 0.0510 | 0.5270 | 0.9181 | 17.9998 | 9.2443 | 0.6503 | 22.3 | 0.033730 | 0.001085 | 0.026522 | 0.006123 | 2858000.0 | % | 2015 | Developing | 77.8 | 74.0 | 364.975229 | 58.0 | 0 | 99.0 | 99.0 | 0.1 | 28873.0 | 0.762 | 14.2 |
| 2 | Algeria | 14.4175 | 0.8972 | 4.2035 | 1.2171 | 0.2008 | 0.5772 | 3.8961 | 8.0934 | 0.1067 | 1.1983 | 0.2698 | 0.1568 | 0.1129 | 0.2886 | 0.8595 | 35.5857 | 27.3606 | 0.5145 | 26.6 | 0.017375 | 0.001309 | 0.009142 | 0.006925 | 43406000.0 | % | 2015 | Developing | 75.6 | 19.0 | 0.000000 | 59.5 | 24 | 95.0 | 95.0 | 0.1 | 39871528.0 | 0.743 | 14.4 |
| 3 | Angola | 15.3041 | 1.3130 | 6.5545 | 0.1539 | 1.4155 | 0.3488 | 11.0268 | 1.2309 | 0.1539 | 3.9902 | 0.3282 | 0.0103 | 0.7078 | 0.1128 | 0.0308 | 34.7010 | 22.4638 | 0.1231 | 6.8 | 0.000165 | 0.000010 | 0.000054 | 0.000102 | 31427000.0 | % | 2015 | Developing | 52.4 | 335.0 | 0.000000 | 23.3 | 98 | 7.0 | 64.0 | 1.9 | 2785935.0 | 0.531 | 11.4 |
| 4 | Antigua and Barbuda | 27.7033 | 4.6686 | 3.2153 | 0.3872 | 1.5263 | 1.2177 | 14.3202 | 6.6607 | 0.1347 | 1.3579 | 0.0673 | 0.3591 | 0.0449 | 1.0549 | 0.2020 | 22.2995 | 14.4436 | 0.2469 | 19.1 | 0.025773 | 0.003093 | 0.019588 | 0.003093 | 97000.0 | % | 2015 | Developing | 76.4 | 13.0 | 0.000000 | 47.7 | 0 | 86.0 | 99.0 | 0.2 | NaN | 0.784 | 13.9 |
Highly correlated (absolute correlation at or above 0.5): Schooling with 'Animal fats'.
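The 0.5 cutoff mentioned above can be applied to a whole correlation matrix at once. The following is a minimal sketch, not the notebook's own method: the helper name `strong_pairs` and the toy values are assumptions made for illustration, and the real input would be the merged table.

```python
import numpy as np
import pandas as pd

def strong_pairs(df: pd.DataFrame, threshold: float = 0.5) -> pd.Series:
    """Return variable pairs whose absolute Pearson correlation is >= threshold."""
    corr = df.select_dtypes(include="number").corr()
    # Keep only the upper triangle so each pair appears once and the diagonal is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    strong = pairs[pairs.abs() >= threshold]
    # Strongest relationships first, regardless of sign.
    return strong.sort_values(key=lambda s: s.abs(), ascending=False)

# Toy stand-in for the merged table (values are made up for illustration).
toy = pd.DataFrame({
    "Schooling":   [10.1, 14.2, 14.4, 11.4, 13.9, 16.0],
    "Animal fats": [1.0, 3.5, 3.0, 1.3, 4.7, 6.0],
    "Noise":       [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
})
result = strong_pairs(toy, threshold=0.5)
print(result)
```

Using the absolute value keeps strong negative relationships in the list as well as strong positive ones.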
To control for the availability of ventilators and other country-financed health interventions and infrastructure for dealing with health emergencies such as pandemics, we add the Global Health Security (GHS) Index as a variable.
1) Major finding of the study, and a caveat for this data set: "The GHS Index analysis finds no country is fully prepared for epidemics or pandemics. Collectively, international preparedness is weak. Many countries do not show evidence of the health security capacities and capabilities that are needed to prevent, detect, and respond to significant infectious disease outbreaks. The average overall GHS Index score among all 195 countries assessed is 40.2 of a possible score of 100."
~Index: Global Health Security Index https://www.ghsindex.org/wp-content/uploads/2019/10/2019-Global-Health-Security-Index.pdf
Although 86% of countries invest local or donor funds in health security, few countries pay for health security gap assessments and action plans out of national budgets.
2) Variable: Ventilators
3) Variable: Critical Care Beds, sampling (per 100,000 population; income group in parentheses)
Saudi Arabia: 22.8 (high)
Pakistan: 1.5 (lower-middle)
Iran: 4.6 (upper-middle)
Oman: 14.6 (high)
Yemen: about 2.46 (700 beds for a country population of 28.5 million; 0.0000245614 is the per-person ratio, not the per-100,000 rate)
Source: https://www.researchgate.net/figure/Number-of-critical-care-beds-per-100-000-population_fig1_338520008
Health expenditure: https://link.springer.com/article/10.1007/s00134-012-2627-8
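To keep the critical care bed figures on one scale, a raw bed count has to be divided by population and multiplied by 100,000. A small sketch of that arithmetic for the Yemen entry (700 beds, 28.5 million people); the function name `beds_per_100k` is an assumption introduced here:

```python
def beds_per_100k(beds: float, population: float) -> float:
    """Critical care beds per 100,000 inhabitants."""
    return beds / population * 100_000

# Yemen: 700 beds for a population of 28.5 million (figures from the note above).
yemen_rate = beds_per_100k(beds=700, population=28_500_000)
print(round(yemen_rate, 2))  # 2.46
```

The unscaled ratio 700 / 28,500,000 is roughly 0.0000246 beds per person, which is why it looks so much smaller than the other countries' values.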
# Define a dictionary containing Health Index
health_score_data = {'Country': ['United States','United Kingdom', 'Netherlands', 'Australia','Canada','Thailand','Sweden',
'Denmark','South Korea','Finland','France','Slovenia','Switzerland','Germany','Spain','Norway',
'Latvia','Malaysia','Belgium','Portugal','Japan', 'Brazil','Ireland','Singapore','Argentina',
'Austria','Chile','Mexico','Estonia','Indonesia','Italy','Poland','Lithuania','South Africa',
'Hungary','New Zealand','Greece','Croatia','Albania','Turkey','Serbia','Czech Republic','Georgia',
'Armenia','Ecuador','Mongolia','Kyrgyz Republic','Saudi Arabia','Peru','Vietnam','China','Slovakia',
'Philippines','Israel','Kenya','United Arab Emirates','India','Iceland','Kuwait','Romania',
'Bulgaria','Costa Rica','Russia','Uganda','Colombia','El Salvador','Luxembourg','Montenegro','Morocco',
'Panama','Liechtenstein','Myanmar','Laos','Lebanon','Nicaragua','Oman','Cyprus','Moldova',
'Bosnia and Herzegovina','Jordan','Uruguay','Qatar','Kazakhstan','Ethiopia','Bhutan','Madagascar',
'Egypt','Bahrain','Cambodia','North Macedonia','Dominican Republic','Sierra Leone','Zimbabwe','Ukraine',
'Senegal','Nigeria','Iran','Malta','Trinidad and Tobago','Suriname','Tanzania','Bolivia',
'Paraguay','Namibia',"Côte d'Ivoire",'Ghana','Pakistan','Belarus','St. Lucia','Cuba','Liberia','Nepal',
'Bangladesh','Mauritius','Cameroon','Uzbekistan','Azerbaijan','Gambia','Rwanda','Sri Lanka','Maldives',
'Tunisia','St. Vincent and The Grenadines','Micronesia','Guatemala','Guinea','Monaco','Brunei','Togo',
'Afghanistan','Tajikistan','Niger','Barbados','Seychelles','Belize','Turkmenistan', 'Guyana','Haiti',
'Botswana','San Marino','Swaziland','Bahamas','Andorra','Lesotho','Burkina Faso','Cabo Verde',
'Antigua and Barbuda','Jamaica','Mali','Benin','Chad','Zambia','Mozambique','Malawi',
'Papua New Guinea','Honduras','Grenada','Mauritania','Central African Republic','Comoros','Congo','Samoa',
'St. Kitts and Nevis','Sudan','Vanuatu','Timor-Leste','Iraq','Fiji','Libya','Angola','Tonga',
'Dominica','Algeria','Brazzaville','Djibouti','Venezuela','Burundi','Eritrea','Palau','South Sudan','Tuvalu',
'Nauru','Solomon Islands','Niue','Cook Islands','Gabon','Guinea-Bissau','Syria','Kiribati',
'Yemen','Marshall Islands','São Tomé and Príncipe','North Korea','Somalia','Equatorial Guinea'],
'Health_index': [83.5,77.9,75.6,75.5,75.3,73.2,72.1,70.4,70.2,68.7,68.2,67.2,67.0,66.0,65.9,64.6,62.9,62.2,61.0,
60.3,59.8,59.7,59.0,58.7,58.6,58.5,58.3,57.6,57.0,56.6,56.2,55.4,55.0,54.8,54.0,54.0,53.8,53.3,52.9,52.4,
52.3,52.0,52.0,50.2,50.1,49.5,49.3,49.3,49.2,49.1,48.2,47.9,47.6,47.3,47.1,46.7,46.5,46.3,46.1,45.8,45.6,
45.1,44.3,44.3,44.2,44.2,43.8,43.7,43.7,43.7,43.5,43.4,43.1,43.1,43.1,43.1,43.0,42.9,42.8,42.1,
41.3,41.2,40.7,40.6,40.3,40.1,39.9,39.4,39.2,39.1,38.3,38.2,38.2,38.0,37.9,37.8,37.7,37.3,36.6,36.5,36.4,
35.8,35.7,35.6,35.5,35.5,35.5,35.3,35.3,35.2,35.1,35.1,35.0,34.9,34.4,34.3,34.2,34.2,34.2,33.9,33.8,33.7,
33.0,32.8,32.7,32.7,32.7,32.6,32.5,32.3,32.3,32.2,31.9,31.9,31.8,31.8,31.7,31.5,31.1,31.1,31.1,30.6,30.5,
30.2,30.1,29.3,29.0,29.0,29.0,28.8,28.8,28.7,28.1,28.0,27.8,27.6,27.5,27.5,27.3,27.2,26.5,26.4,26.2,26.2,
26.1,26.0,25.8,25.7,25.7,25.2,25.1,24.0,23.6,23.6,23.2,23.0,22.8,22.4,21.9,21.7,21.6,20.8,20.7,20.5,
20.4,20.0,20.0,19.9,19.2,18.5,18.2,17.7,17.5,16.6,16.2]}
# Convert the dictionary into a DataFrame
health_df = pd.DataFrame(health_score_data)
# Merge the 2 data sets with an inner join on the country name;
# countries whose names differ between the two sources are dropped
merged_df = pd.merge(covid_df, health_df, on='Country')
merged_df.describe()
| | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | Oilcrops | ... | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population_y | Income composition of resources | Schooling | Health_index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | ... | 145.000000 | 144.000000 | 145.000000 | 145.000000 | 145.000000 | 145.000000 | 1.300000e+02 | 145.000000 | 145.000000 | 145.000000 |
| mean | 20.523469 | 4.186661 | 4.506248 | 0.961694 | 0.853244 | 0.550711 | 9.116830 | 5.255140 | 0.149395 | 3.347308 | ... | 3.008654 | 43.340972 | 34.110345 | 83.041379 | 84.703448 | 0.657241 | 1.178883e+07 | 0.697276 | 13.077241 | 42.395172 |
| std | 8.081516 | 3.432512 | 3.257513 | 0.666706 | 0.976601 | 0.899651 | 4.345607 | 3.348297 | 0.116589 | 4.752064 | ... | 30.841796 | 20.697557 | 117.879418 | 23.977094 | 21.768710 | 1.336545 | 3.013329e+07 | 0.155513 | 2.938318 | 13.696793 |
| min | 5.018200 | 0.034800 | 0.990800 | 0.058000 | 0.017400 | 0.037300 | 0.906100 | 0.177900 | 0.000000 | 0.064000 | ... | 0.000000 | 2.500000 | 0.000000 | 6.000000 | 6.000000 | 0.100000 | 2.966000e+03 | 0.347000 | 5.400000 | 18.500000 |
| 25% | 13.676600 | 1.611300 | 2.118700 | 0.350900 | 0.318900 | 0.237300 | 6.101000 | 2.229600 | 0.076300 | 0.856600 | ... | 0.000000 | 24.300000 | 0.000000 | 83.000000 | 84.000000 | 0.100000 | 2.970185e+05 | 0.575000 | 10.800000 | 31.900000 |
| 50% | 20.151100 | 3.221300 | 3.447900 | 0.901900 | 0.563300 | 0.351700 | 8.912700 | 5.127900 | 0.118000 | 1.580300 | ... | 0.000000 | 52.600000 | 3.000000 | 93.000000 | 93.000000 | 0.100000 | 2.510890e+06 | 0.734000 | 13.100000 | 40.100000 |
| 75% | 26.910000 | 6.378700 | 5.737400 | 1.266400 | 1.045700 | 0.577200 | 11.558100 | 7.592600 | 0.188500 | 3.438900 | ... | 0.000000 | 61.450000 | 21.000000 | 97.000000 | 97.000000 | 0.300000 | 1.095208e+07 | 0.804000 | 15.300000 | 52.300000 |
| max | 36.901800 | 14.937300 | 18.376300 | 3.275600 | 8.406800 | 9.672700 | 21.606200 | 17.757600 | 0.726800 | 28.563900 | ... | 364.975229 | 77.600000 | 1100.000000 | 99.000000 | 99.000000 | 9.300000 | 2.581621e+08 | 0.948000 | 20.400000 | 75.600000 |
8 rows × 37 columns
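The profile report lists 157 countries, while the merged summary counts 145, so the inner join dropped some rows, most likely because country names are spelled differently in the two sources. One way to see which names failed to match is an outer merge with `indicator=True`. This is a sketch on hypothetical miniature tables: `covid_small`, `health_small`, the 'Recovered' values, and the long-form Iran spelling are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of covid_df and health_df.
covid_small = pd.DataFrame({
    "Country": ["Albania", "Iran (Islamic Republic of)", "Oman"],
    "Recovered": [0.0265, 0.0900, 0.0100],  # made-up values
})
health_small = pd.DataFrame({
    "Country": ["Albania", "Iran", "Oman"],
    "Health_index": [52.9, 37.7, 43.1],
})

# An outer join keeps every row; the _merge column says which side it came from.
check = pd.merge(covid_small, health_small, on="Country", how="outer", indicator=True)
unmatched = check.loc[check["_merge"] != "both", ["Country", "_merge"]]
print(unmatched)
```

Any name reported as `left_only` or `right_only` is a candidate for renaming before the real merge.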
a = merged_df['Health_index']
b = merged_df['Recovered']
plt.scatter(a,b, color ='pink')
plt.xlabel('Health Score Index')
plt.ylabel("Recovered")
plt.title('Health Score Index versus Recovered')
plt.show()
Note that the "Health Score Index" (HSI) reflects access to infrastructure (for example, more ventilators and critical care beds). However, the countries with the highest HSI do not show the highest 'Recovered' rates for COVID-19 cases. Instead, countries in the mid-range of the HSI show the highest 'Recovered' rates. Other contributing factors, such as nutrition or dietary choices, may therefore matter for feature importance.
We will conduct a 'feature importance' test. Below, visuals 2-5 show how our nutritional factors correlate with each other and with our target of interest: the "Recovered" rate for COVID-19 cases.
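The mid-range observation from the scatter plot can also be checked numerically by grouping countries into HSI bands and comparing 'Recovered' within each band. A minimal sketch, assuming three equal-sized bands via `pd.qcut`; the toy values and the `hsi_band` column name are invented for illustration and are not results from the data set.

```python
import pandas as pd

# Toy stand-in for merged_df (values are made up, not taken from the data).
toy = pd.DataFrame({
    "Health_index": [20.0, 25.0, 30.0, 40.0, 45.0, 50.0, 60.0, 70.0, 75.0],
    "Recovered":    [0.001, 0.002, 0.003, 0.05, 0.20, 0.08, 0.03, 0.04, 0.02],
})

# Split countries into three equal-sized bands by Health_index.
toy["hsi_band"] = pd.qcut(toy["Health_index"], q=3, labels=["low", "mid", "high"])

# Median is used alongside max because 'Recovered' is strongly right-skewed.
band_summary = toy.groupby("hsi_band", observed=True)["Recovered"].agg(["median", "max", "count"])
print(band_summary)
```

On the real table, the same grouping would show whether the mid band stands out across the whole band or only because of a few outlying countries.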
merged_df.columns
Index(['Country', 'Animal Products', 'Animal fats', 'Cereals - Excluding Beer',
'Eggs', 'Fish, Seafood', 'Fruits - Excluding Wine', 'Meat',
'Milk - Excluding Butter', 'Offals', 'Oilcrops', 'Pulses', 'Spices',
'Starchy Roots', 'Stimulants', 'Treenuts', 'Vegetal Products',
'Vegetable Oils', 'Vegetables', 'Obesity', 'Confirmed', 'Deaths',
'Recovered', 'Active', 'Population_x', 'Unit (all except Population)',
'Year', 'Status', 'Life expectancy ', 'Adult Mortality',
'percentage expenditure', ' BMI ', 'under-five deaths ', 'Polio',
'Diphtheria ', ' HIV/AIDS', 'Population_y',
'Income composition of resources', 'Schooling', 'Health_index'],
dtype='object')
Need to fill in missing values for percentage expenditure, BMI, Polio, Treenuts, Pulses, and Vegetables.
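One simple option for the missing values is median imputation, sketched below on a toy frame. The choice of the median is an assumption made here, not something the notebook specifies. Two details from the outputs above matter: several column names carry stray spaces (' BMI ', 'Life expectancy '), and 'percentage expenditure' is 98.7% zeros with no NaN values, so a NaN fill would leave it unchanged.

```python
import numpy as np
import pandas as pd

# Toy stand-in for merged_df; note the padded column name, as in the real table.
toy = pd.DataFrame({
    " BMI ": [19.1, 58.0, np.nan, 23.3],
    "Polio": [6.0, 99.0, 95.0, np.nan],
    "Country": ["A", "B", "C", "D"],
})

# Normalize column names so ' BMI ' becomes 'BMI'.
toy.columns = toy.columns.str.strip()

# Fill NaN in every numeric column with that column's median.
numeric_cols = toy.select_dtypes(include="number").columns
filled = toy.copy()
filled[numeric_cols] = toy[numeric_cols].fillna(toy[numeric_cols].median())
print(filled)
```

The median is less sensitive than the mean to the heavy skew seen in several of these variables, though whether any imputation is appropriate depends on why the values are missing.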
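# Restrict to numeric columns so the call also works on pandas versions that no longer drop text columns silently
merged_df.select_dtypes(include='number').corr()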
| | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | Oilcrops | ... | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population_y | Income composition of resources | Schooling | Health_index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Animal Products | 1.000000 | 0.710333 | -0.478648 | 0.469318 | -0.022442 | -0.101656 | 0.736385 | 0.641814 | 0.029119 | -0.413549 | ... | 0.118921 | 0.437910 | -0.199489 | 0.398435 | 0.295216 | -0.375809 | -0.180596 | 0.701782 | 0.629505 | 0.515481 |
| Animal fats | 0.710333 | 1.000000 | -0.416339 | 0.262446 | -0.116403 | -0.159390 | 0.251783 | 0.351140 | -0.194524 | -0.329150 | ... | -0.008903 | 0.379134 | -0.036428 | 0.289351 | 0.248572 | -0.317849 | -0.084778 | 0.596489 | 0.563103 | 0.506550 |
| Cereals - Excluding Beer | -0.478648 | -0.416339 | 1.000000 | -0.308149 | -0.054167 | 0.009134 | -0.299150 | -0.272598 | 0.279798 | 0.114859 | ... | -0.028851 | -0.444254 | 0.173847 | -0.423549 | -0.270659 | 0.515659 | 0.060297 | -0.632969 | -0.571845 | -0.444253 |
| Eggs | 0.469318 | 0.262446 | -0.308149 | 1.000000 | 0.223854 | -0.052851 | 0.263722 | 0.261878 | -0.138007 | -0.321104 | ... | 0.077561 | 0.232045 | -0.109087 | 0.274823 | 0.222723 | -0.389517 | 0.057395 | 0.560508 | 0.451998 | 0.458177 |
| Fish, Seafood | -0.022442 | -0.116403 | -0.054167 | 0.223854 | 1.000000 | 0.018571 | -0.006628 | -0.259120 | -0.099272 | 0.346591 | ... | -0.073124 | -0.113074 | -0.062338 | -0.010877 | -0.042181 | -0.077028 | 0.035187 | 0.010264 | 0.001983 | -0.068667 |
| Fruits - Excluding Wine | -0.101656 | -0.159390 | 0.009134 | -0.052851 | 0.018571 | 1.000000 | -0.024509 | -0.047790 | 0.079720 | 0.032919 | ... | 0.006071 | -0.065394 | -0.037703 | 0.037670 | -0.026199 | -0.047501 | 0.023415 | -0.092409 | -0.057522 | -0.104336 |
| Meat | 0.736385 | 0.251783 | -0.299150 | 0.263722 | -0.006628 | -0.024509 | 1.000000 | 0.163367 | 0.209309 | -0.228055 | ... | -0.018125 | 0.273449 | -0.263508 | 0.226785 | 0.094194 | -0.137876 | -0.192964 | 0.405613 | 0.372530 | 0.244104 |
| Milk - Excluding Butter | 0.641814 | 0.351140 | -0.272598 | 0.261878 | -0.259120 | -0.047790 | 0.163367 | 1.000000 | 0.019658 | -0.403036 | ... | 0.321355 | 0.310507 | -0.062141 | 0.325657 | 0.315474 | -0.312569 | -0.122456 | 0.450697 | 0.376794 | 0.343129 |
| Offals | 0.029119 | -0.194524 | 0.279798 | -0.138007 | -0.099272 | 0.079720 | 0.209309 | 0.019658 | 1.000000 | 0.027023 | ... | 0.123583 | -0.180217 | -0.001941 | -0.187155 | -0.346686 | 0.298875 | -0.039800 | -0.271129 | -0.253385 | -0.175541 |
| Oilcrops | -0.413549 | -0.329150 | 0.114859 | -0.321104 | 0.346591 | 0.032919 | -0.228055 | -0.403036 | 0.027023 | 1.000000 | ... | -0.010976 | -0.163590 | 0.038290 | -0.264518 | -0.202584 | 0.109738 | -0.009143 | -0.369430 | -0.344473 | -0.382370 |
| Pulses | -0.419936 | -0.311695 | 0.409971 | -0.327025 | -0.092894 | 0.497729 | -0.320564 | -0.188129 | 0.071100 | 0.145602 | ... | -0.037511 | -0.361713 | 0.280064 | -0.110328 | -0.119864 | 0.172618 | -0.006194 | -0.516880 | -0.477461 | -0.267498 |
| Spices | -0.209950 | -0.194603 | 0.100376 | 0.011249 | 0.221450 | 0.003150 | -0.237087 | -0.062440 | -0.109078 | 0.127786 | ... | -0.049728 | -0.176725 | 0.147135 | 0.015127 | 0.114354 | -0.040463 | 0.023170 | -0.134220 | -0.133312 | -0.087904 |
| Starchy Roots | -0.387591 | -0.302344 | 0.202710 | -0.350042 | 0.172502 | 0.447195 | -0.166073 | -0.395585 | 0.143363 | 0.304605 | ... | -0.046320 | -0.352279 | 0.137222 | -0.212275 | -0.298081 | 0.295374 | 0.120540 | -0.449930 | -0.341704 | -0.321883 |
| Stimulants | 0.495903 | 0.296297 | -0.270974 | 0.289669 | 0.020412 | -0.081584 | 0.290229 | 0.455033 | -0.062976 | -0.274782 | ... | -0.023394 | 0.242331 | -0.204802 | 0.271028 | 0.231936 | -0.257840 | -0.149345 | 0.474446 | 0.394878 | 0.214624 |
| Treenuts | 0.188401 | 0.164782 | -0.253826 | 0.306574 | 0.201834 | -0.098389 | -0.038856 | 0.222835 | -0.181652 | -0.209238 | ... | 0.025898 | 0.238155 | -0.088916 | 0.203553 | 0.230329 | -0.182906 | 0.020511 | 0.318894 | 0.289013 | 0.258659 |
| Vegetal Products | -1.000000 | -0.710321 | 0.478633 | -0.469364 | 0.022396 | 0.101486 | -0.736382 | -0.641805 | -0.029135 | 0.413568 | ... | -0.118908 | -0.437913 | 0.199511 | -0.398488 | -0.295246 | 0.375775 | 0.180667 | -0.701788 | -0.629521 | -0.515501 |
| Vegetable Oils | -0.669238 | -0.389604 | 0.023742 | -0.211264 | -0.240058 | -0.072968 | -0.548870 | -0.385380 | -0.175089 | -0.243485 | ... | -0.116374 | -0.184612 | 0.128582 | -0.126639 | -0.112706 | 0.157890 | 0.190102 | -0.282703 | -0.242394 | -0.127352 |
| Vegetables | 0.071712 | -0.100507 | -0.012157 | 0.183638 | -0.018436 | 0.033612 | -0.026554 | 0.278164 | 0.036590 | -0.126193 | ... | 0.153100 | 0.060095 | 0.054947 | 0.067792 | 0.066522 | -0.165925 | 0.090252 | 0.055665 | -0.016862 | -0.041928 |
| Obesity | 0.435854 | 0.392369 | -0.489569 | 0.291984 | -0.147099 | -0.088428 | 0.290266 | 0.266737 | -0.257126 | -0.119119 | ... | 0.006395 | 0.780198 | -0.327034 | 0.312374 | 0.305877 | -0.375988 | -0.076333 | 0.687399 | 0.627812 | 0.327343 |
| Confirmed | 0.385509 | 0.363431 | -0.358845 | 0.273573 | 0.066726 | -0.053812 | 0.188795 | 0.247044 | -0.147456 | -0.278321 | ... | -0.040638 | 0.440685 | -0.161320 | 0.270600 | 0.173125 | -0.229008 | -0.014421 | 0.536300 | 0.472649 | 0.456909 |
| Deaths | 0.244084 | 0.340994 | -0.296291 | 0.118489 | -0.017931 | -0.065005 | 0.050055 | 0.160422 | -0.159473 | -0.193428 | ... | -0.029572 | 0.310681 | -0.098364 | 0.176261 | 0.142232 | -0.155151 | 0.020222 | 0.399791 | 0.383512 | 0.472358 |
| Recovered | 0.348826 | 0.310547 | -0.294936 | 0.169268 | -0.002495 | -0.044328 | 0.204846 | 0.232973 | -0.144680 | -0.223579 | ... | -0.025195 | 0.363214 | -0.129881 | 0.218277 | 0.163858 | -0.189420 | -0.008604 | 0.472593 | 0.421753 | 0.355128 |
| Active | 0.249087 | 0.241329 | -0.263337 | 0.303952 | 0.148323 | -0.035477 | 0.084228 | 0.144775 | -0.066046 | -0.218805 | ... | -0.042826 | 0.331412 | -0.128972 | 0.213607 | 0.094970 | -0.172642 | -0.022490 | 0.354137 | 0.298410 | 0.347919 |
| Population_x | 0.002143 | 0.017692 | 0.005893 | 0.130911 | -0.005898 | -0.040584 | 0.016784 | -0.062844 | 0.103447 | -0.026496 | ... | -0.021487 | -0.124958 | 0.697445 | 0.003707 | 0.013292 | -0.043567 | 0.119405 | -0.038034 | -0.047860 | 0.091874 |
| Year | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Life expectancy | 0.634044 | 0.539625 | -0.553037 | 0.565338 | 0.055552 | -0.032489 | 0.320719 | 0.442948 | -0.312382 | -0.333776 | ... | 0.045271 | 0.537568 | -0.267914 | 0.534340 | 0.463969 | -0.614532 | -0.065091 | 0.907300 | 0.818421 | 0.650942 |
| Adult Mortality | -0.432882 | -0.369972 | 0.403518 | -0.389155 | -0.046743 | 0.036476 | -0.166003 | -0.369649 | 0.306215 | 0.162789 | ... | -0.044914 | -0.398911 | 0.206128 | -0.412024 | -0.291840 | 0.674491 | 0.056156 | -0.654776 | -0.546813 | -0.413978 |
| percentage expenditure | 0.118921 | -0.008903 | -0.028851 | 0.077561 | -0.073124 | 0.006071 | -0.018125 | 0.321355 | 0.123583 | -0.010976 | ... | 1.000000 | 0.039545 | -0.017123 | 0.003127 | 0.039444 | -0.040954 | -0.021546 | 0.011676 | 0.015139 | 0.051198 |
| BMI | 0.437910 | 0.379134 | -0.444254 | 0.232045 | -0.113074 | -0.065394 | 0.273449 | 0.310507 | -0.180217 | -0.163590 | ... | 0.039545 | 1.000000 | -0.243160 | 0.307705 | 0.233360 | -0.308911 | -0.016152 | 0.628581 | 0.603645 | 0.367911 |
| under-five deaths | -0.199489 | -0.036428 | 0.173847 | -0.109087 | -0.062338 | -0.037703 | -0.263508 | -0.062141 | -0.001941 | 0.038290 | ... | -0.017123 | -0.243160 | 1.000000 | -0.161487 | -0.151832 | 0.126638 | 0.307181 | -0.250177 | -0.253128 | -0.055520 |
| Polio | 0.398435 | 0.289351 | -0.423549 | 0.274823 | -0.010877 | 0.037670 | 0.226785 | 0.325657 | -0.187155 | -0.264518 | ... | 0.003127 | 0.307705 | -0.161487 | 1.000000 | 0.639053 | -0.428879 | -0.265636 | 0.500279 | 0.413611 | 0.369507 |
| Diphtheria | 0.295216 | 0.248572 | -0.270659 | 0.222723 | -0.042181 | -0.026199 | 0.094194 | 0.315474 | -0.346686 | -0.202584 | ... | 0.039444 | 0.233360 | -0.151832 | 0.639053 | 1.000000 | -0.337005 | -0.078368 | 0.430368 | 0.388690 | 0.325217 |
| HIV/AIDS | -0.375809 | -0.317849 | 0.515659 | -0.389517 | -0.077028 | -0.047501 | -0.137876 | -0.312569 | 0.298875 | 0.109738 | ... | -0.040954 | -0.308911 | 0.126638 | -0.428879 | -0.337005 | 1.000000 | 0.042649 | -0.497048 | -0.396047 | -0.302289 |
| Population_y | -0.180596 | -0.084778 | 0.060297 | 0.057395 | 0.035187 | 0.023415 | -0.192964 | -0.122456 | -0.039800 | -0.009143 | ... | -0.021546 | -0.016152 | 0.307181 | -0.265636 | -0.078368 | 0.042649 | 1.000000 | -0.004567 | 0.013002 | 0.135824 |
| Income composition of resources | 0.701782 | 0.596489 | -0.632969 | 0.560508 | 0.010264 | -0.092409 | 0.405613 | 0.450697 | -0.271129 | -0.369430 | ... | 0.011676 | 0.628581 | -0.250177 | 0.500279 | 0.430368 | -0.497048 | -0.004567 | 1.000000 | 0.925844 | 0.696775 |
| Schooling | 0.629505 | 0.563103 | -0.571845 | 0.451998 | 0.001983 | -0.057522 | 0.372530 | 0.376794 | -0.253385 | -0.344473 | ... | 0.015139 | 0.603645 | -0.253128 | 0.413611 | 0.388690 | -0.396047 | 0.013002 | 0.925844 | 1.000000 | 0.700796 |
| Health_index | 0.515481 | 0.506550 | -0.444253 | 0.458177 | -0.068667 | -0.104336 | 0.244104 | 0.343129 | -0.175541 | -0.382370 | ... | 0.051198 | 0.367911 | -0.055520 | 0.369507 | 0.325217 | -0.302289 | 0.135824 | 0.696775 | 0.700796 | 1.000000 |
37 rows × 37 columns
merged_df['Recovered'].value_counts().plot.bar(title='Frequency Distribution of Recovered')
<matplotlib.axes._subplots.AxesSubplot at 0x123eb2a90>
c = merged_df['Pulses']
d = merged_df['Recovered']
plt.scatter(c,d, color = 'purple')
plt.xlabel('Heavy Pulse Diet')
plt.ylabel("Recovered")
plt.title('High Pulse Diet Related to Recovery Rate')
plt.show()
nutrition_df1 = merged_df.drop(['Deaths','Obesity',
'under-five deaths ',
'Recovered', 'Active',
'Country','under-five deaths ',
'Unit (all except Population)', 'Year',
'Status',' BMI ','Polio','percentage expenditure',
'Confirmed','Diphtheria ', ' HIV/AIDS',
'Population_x', 'Population_y',
'Life expectancy ', 'Adult Mortality',
], axis=1)
#nutrition_df= nutrition_df.drop(['treenuts_categorical'], axis=1)
#'Obesity', 'Deaths',
#'Recovered', 'Active', 'Population_x', 'Unit (all except Population)',
#'Year', 'Status', 'Life expectancy ', 'Adult Mortality',
#'percentage expenditure', ' BMI ', 'under-five deaths ', 'Polio',
#'Diphtheria ', ' HIV/AIDS',
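# pairplot builds its own figure, so set the title on the returned PairGrid
pair_grid = sns.pairplot(nutrition_df1)
pair_grid.fig.suptitle('Nutritional Pairplotting of Different Diets', y=1.02)
pair_grid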
<seaborn.axisgrid.PairGrid at 0x1245b4450>
from sklearn import tree
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import accuracy_score
#X = merged_df[['percentage expenditure',' BMI ','Polio','Treenuts','Pulses','Meat','Vegetables','Health_index']]
#X = merged_df['Treenuts','Health_index'] #Dropping a bunch to check
merged_df_dropped = merged_df.dropna()
# X = merged_df_dropped[['Treenuts', 'Health_index']]
X = merged_df_dropped.drop(columns=['Deaths',
'under-five deaths ',
'Recovered',
'Country',
'Unit (all except Population)',
'Status',
'Confirmed',
'Population_x',
'Population_y'])
y = merged_df_dropped['Recovered']
#Split data to train and test with 20% sample
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=.2,random_state =5)
#regressor = DecisionTreeRegressor()
dtr_model = DecisionTreeRegressor()
dtr_model.fit(X_train, y_train)
DecisionTreeRegressor()
We will identify the most important features in the training set in our Decision Tree Regression model.
print(dtr_model.score(X_train, y_train))
1.0
print(dtr_model.score(X_test, y_test))
-0.6108109683200682
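A negative test score can look like an error, but `score` on a scikit-learn regressor returns the coefficient of determination, R² = 1 - SS_res / SS_tot. That value drops below zero whenever the predictions are worse than simply predicting the mean of the test targets. The sketch below recomputes the formula in plain Python; the function name and the numbers are made up for illustration and are not taken from `merged_df`.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Illustrative targets only (not rows from merged_df)
y_true = [0.1, 0.2, 0.3, 0.4]

perfect = r_squared(y_true, y_true)               # predictions equal the targets
baseline = r_squared(y_true, [0.25] * 4)          # always predict the mean
overfit = r_squared(y_true, [0.4, 0.1, 0.5, 0.0])  # predictions worse than the mean

print(perfect, baseline, overfit)
```

A perfect fit gives 1, the mean-only baseline gives 0, and anything worse than the baseline is negative, which is the situation the decision tree is in on the test split.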
pd.Series(dtr_model.feature_importances_, index=X_train.columns).sort_values(ascending=False)
Life expectancy 4.344490e-01
Starchy Roots 3.403535e-01
Cereals - Excluding Beer 1.021883e-01
Active 3.891012e-02
Offals 2.390078e-02
Schooling 1.769176e-02
Vegetables 8.377621e-03
Fish, Seafood 6.602875e-03
Income composition of resources 6.562577e-03
Diphtheria 5.603951e-03
Spices 4.725922e-03
Fruits - Excluding Wine 3.783575e-03
Vegetable Oils 2.641331e-03
Eggs 2.057723e-03
Treenuts 1.012967e-03
Milk - Excluding Butter 8.068200e-04
Stimulants 1.720608e-04
Polio 5.463217e-05
BMI 4.997524e-05
Adult Mortality 2.979620e-05
Animal Products 7.596087e-06
Pulses 6.669819e-06
HIV/AIDS 6.386506e-06
Oilcrops 1.597028e-06
Vegetal Products 1.546011e-06
Health_index 9.333074e-07
Meat 5.404925e-08
Obesity 0.000000e+00
Year 0.000000e+00
percentage expenditure 0.000000e+00
Animal fats 0.000000e+00
dtype: float64
Our Decision Tree Regression model overfit the training data: it scored 1.0 on the training set but about -0.61 on the test set above, which is worse than always predicting the mean. We will run a second model, a Linear Regression model, with the same target variable 'Recovered' and the same independent variables.
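from sklearn.linear_model import LinearRegression
line = LinearRegression().fit(X_train, y_train)  # fit on the same training split
line.score(X_train, y_train)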
0.3861260490273343
line.score(X_test, y_test)
0.20605411393167627
#What do our coefficients say regarding our target ('Recovered')?
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.metrics import mean_squared_error, r2_score
# We add a constant to the model as it's a best practice
# to do so every time!
X = sm.add_constant(X)
# We fit an OLS model using statsmodels
results = sm.OLS(y, X).fit()
# We print the summary results
print(results.summary())
OLS Regression Results
==============================================================================
Dep. Variable: Recovered R-squared: 0.385
Model: OLS Adj. R-squared: 0.184
Method: Least Squares F-statistic: 1.918
Date: Mon, 15 Jun 2020 Prob (F-statistic): 0.00963
Time: 18:26:47 Log-Likelihood: 145.39
No. Observations: 123 AIC: -228.8
Df Residuals: 92 BIC: -141.6
Df Model: 30
Covariance Type: nonrobust
===================================================================================================
coef std err t P>|t| [0.025 0.975]
---------------------------------------------------------------------------------------------------
Animal Products -0.0103 2.351 -0.004 0.997 -4.680 4.659
Animal fats 2.0768 1.646 1.261 0.210 -1.193 5.347
Cereals - Excluding Beer 0.3624 0.173 2.092 0.039 0.018 0.706
Eggs 2.0473 1.648 1.242 0.217 -1.226 5.320
Fish, Seafood 2.0922 1.647 1.270 0.207 -1.179 5.363
Fruits - Excluding Wine 0.3594 0.174 2.064 0.042 0.014 0.705
Meat 2.0782 1.647 1.262 0.210 -1.192 5.349
Milk - Excluding Butter 2.0744 1.646 1.260 0.211 -1.195 5.344
Offals 2.0540 1.651 1.244 0.217 -1.225 5.334
Oilcrops 0.3630 0.172 2.108 0.038 0.021 0.705
Pulses 0.4072 0.177 2.303 0.024 0.056 0.758
Spices 0.3491 0.176 1.980 0.051 -0.001 0.699
Starchy Roots 0.3638 0.168 2.164 0.033 0.030 0.698
Stimulants 0.3992 0.177 2.259 0.026 0.048 0.750
Treenuts 0.3615 0.173 2.088 0.040 0.018 0.705
Vegetal Products 1.7013 2.352 0.723 0.471 -2.970 6.373
Vegetable Oils 0.3651 0.173 2.111 0.037 0.022 0.709
Vegetables 0.3540 0.176 2.016 0.047 0.005 0.703
Obesity -0.0033 0.002 -1.744 0.085 -0.007 0.000
Active 0.0716 0.166 0.430 0.668 -0.259 0.402
Year -0.0514 0.058 -0.883 0.379 -0.167 0.064
Life expectancy 0.0041 0.003 1.205 0.231 -0.003 0.011
Adult Mortality -6.06e-05 0.000 -0.380 0.705 -0.000 0.000
percentage expenditure 1.844e-05 0.000 0.069 0.945 -0.001 0.001
BMI 0.0010 0.001 1.634 0.106 -0.000 0.002
Polio -0.0005 0.001 -0.932 0.354 -0.002 0.001
Diphtheria 5.715e-05 0.001 0.108 0.915 -0.001 0.001
HIV/AIDS 0.0099 0.009 1.041 0.301 -0.009 0.029
Income composition of resources 0.3613 0.240 1.506 0.135 -0.115 0.838
Schooling -0.0069 0.008 -0.819 0.415 -0.024 0.010
Health_index -0.0011 0.001 -1.050 0.297 -0.003 0.001
==============================================================================
Omnibus: 85.237 Durbin-Watson: 2.148
Prob(Omnibus): 0.000 Jarque-Bera (JB): 528.659
Skew: 2.408 Prob(JB): 1.60e-115
Kurtosis: 11.941 Cond. No. 1.10e+06
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.1e+06. This might indicate that there are
strong multicollinearity or other numerical problems.
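The gap between R-squared (0.385) and Adj. R-squared (0.184) in the summary above reflects the penalty for using 30 predictors on only 123 observations. As a check, the standard adjustment formula can be applied to the reported values. The function name is ours, and the result matches the summary only up to the rounding of the printed R-squared.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Values read from the OLS summary: R-squared, No. Observations, Df Model
adj = adjusted_r_squared(0.385, 123, 30)
print(round(adj, 3))
```

The denominator, 123 - 30 - 1 = 92, is the "Df Residuals" value in the same summary.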
We see that the score is much lower on our test set (0.206) than on our training set (0.386). This means that our second model also overfit the training data, though far less severely than the decision tree.
Also, our independent variable of interest, 'Treenuts', did not rank in the top 10 features of the Decision Tree Regression model: it placed 15th, with an importance of about 0.001. It therefore carries less weight in that model than we hypothesized.
Restricted to nutritional features (independent variables), the Decision Tree Regression model's feature importances above rank as follows: 1) 'Starchy Roots', 2) 'Cereals - Excluding Beer', 3) 'Offals', 4) 'Vegetables', 5) 'Fish, Seafood', 6) 'Spices', 7) 'Fruits - Excluding Wine', 8) 'Vegetable Oils', 9) 'Eggs', 10) 'Treenuts'. Even within this list, the importance of 'Treenuts' is more than 300 times smaller than that of 'Starchy Roots'.
However, in our Ordinary Least Squares model (Linear Regression), the 'Treenuts' coefficient on the target variable 'Recovered' is 0.3615 with a p-value of 0.040, which is statistically significant at the 5 percent level. The large condition number (1.1e+06) in the summary warns of multicollinearity, so this coefficient should be read with caution.
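The multicollinearity warning in the OLS summary is consistent with the correlation table, where 'Vegetal Products' and 'Animal Products' correlate at -1.000000. A minimal sketch of why this matters: if the two shares always sum to 100 (an assumption suggested by that correlation, not verified here), one column is an exact linear function of the other and a regression cannot separate their effects. The animal-product values are the Afghanistan, Albania, Algeria, and Angola rows of `merged_df`; the vegetal values and the `pearson` helper are ours.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# 'Animal Products' for Afghanistan, Albania, Algeria, Angola
animal = [21.6397, 32.0002, 14.4175, 15.3041]
# ASSUMPTION: the vegetal share is the remainder of 100 percent
vegetal = [100 - a for a in animal]

r = pearson(animal, vegetal)
print(r)
```

When two predictors are this tightly linked, their individual coefficients and standard errors (2.351 and 2.352 for these two columns in the summary) say little on their own.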
Because our two models (DTR and Linear Regression) overfit our training data, we will use the Random Forest model (RFM) and tune its parameters. We selected RFM because it runs a series of decision trees with different combinations of our features influencing (or not influencing) our target variable, the COVID-19 'Recovered' rate, and averages them, which reduces the tendency of a single tree to memorize the training set. Rows with missing values were already removed with `dropna()`, so the forest is fit on complete rows only. As before, we dropped the features 'Deaths', 'under-five deaths ', and 'Confirmed' because they are not in the sample of people infected with COVID-19.
We will use the GridSearchCV tool to tune parameters and reduce overfitting in our final model.
from sklearn.ensemble import RandomForestRegressor
rfr = RandomForestRegressor(max_depth=2, n_estimators=1000)
rfr.fit(X_train, y_train)
print(rfr.score(X_train, y_train))
print(rfr.score(X_test, y_test))
0.7639240986409821
0.25121836550692256
from sklearn.model_selection import GridSearchCV
params = {
'n_estimators': [50,100,200,500],
'criterion': ['mse', 'mae'],
'max_depth': [2,3,None],
'min_samples_split': [1,2,3],
'min_samples_leaf': [1,2,4,6]
}
random_forest_grid = GridSearchCV(RandomForestRegressor(), param_grid=params, verbose=1, cv=3)
random_forest_grid.fit(X_train, y_train)
Fitting 3 folds for each of 288 candidates, totalling 864 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 864 out of 864 | elapsed: 5.7min finished
GridSearchCV(cv=3, estimator=RandomForestRegressor(),
param_grid={'criterion': ['mse', 'mae'], 'max_depth': [2, 3, None],
'min_samples_leaf': [1, 2, 4, 6],
'min_samples_split': [1, 2, 3],
'n_estimators': [50, 100, 200, 500]},
verbose=1)
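The log above reports 288 candidates and 864 fits. Those numbers follow directly from the grid: an exhaustive search tries every combination of the listed values, and each combination is refit once per cross-validation fold. The sketch below reproduces the count with `itertools.product`; it only enumerates the combinations and does not train anything.

```python
from itertools import product

# Same grid as passed to GridSearchCV above
params = {
    'n_estimators': [50, 100, 200, 500],
    'criterion': ['mse', 'mae'],
    'max_depth': [2, 3, None],
    'min_samples_split': [1, 2, 3],
    'min_samples_leaf': [1, 2, 4, 6],
}
cv_folds = 3

# One candidate per combination of parameter values
candidates = list(product(*params.values()))
n_candidates = len(candidates)
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)
```

Because the count is a product of the list lengths (4 x 2 x 3 x 3 x 4), adding one more value to any list increases the runtime proportionally.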
random_forest_grid.best_params_
{'criterion': 'mse',
'max_depth': 3,
'min_samples_leaf': 6,
'min_samples_split': 2,
'n_estimators': 50}
random_forest_grid.best_score_
0.3370534915128684
better_rfr = RandomForestRegressor(**random_forest_grid.best_params_)
better_rfr.fit(X_train, y_train)
print(better_rfr.score(X_train, y_train))
print(better_rfr.score(X_test, y_test))
0.5116605901627048
0.31966375847390704
pd.Series(better_rfr.feature_importances_, index=X_train.columns).sort_values(ascending=False)
Life expectancy 0.527670
Income composition of resources 0.221833
Active 0.054493
Cereals - Excluding Beer 0.052489
Animal Products 0.036105
Pulses 0.020531
Vegetal Products 0.015512
Fish, Seafood 0.015260
Animal fats 0.012240
Stimulants 0.010632
Vegetable Oils 0.008349
Starchy Roots 0.006505
Milk - Excluding Butter 0.005257
Polio 0.004526
Offals 0.001686
Fruits - Excluding Wine 0.001521
Health_index 0.001457
BMI 0.001361
Diphtheria 0.000926
Vegetables 0.000578
Treenuts 0.000389
Schooling 0.000368
Oilcrops 0.000313
Eggs 0.000000
HIV/AIDS 0.000000
percentage expenditure 0.000000
Meat 0.000000
Adult Mortality 0.000000
Obesity 0.000000
Year 0.000000
Spices 0.000000
dtype: float64
(pd.Series(better_rfr.feature_importances_, index=X_train.columns)*100).sort_values(ascending=True).plot(kind='barh', title='Feature Importance Among Selected Factors (in %) - Higher is Better')
<matplotlib.axes._subplots.AxesSubplot at 0x141399750>
In the Random Forest Model, our independent variable of interest, 'Treenuts', again did not rank among the top ten features of importance: it placed 21st, with an importance of about 0.0004. This means it does not have as much significance as we hypothesized.
The top 10 features are: 1) 'Life expectancy', 2) 'Income composition of resources', 3) 'Active', 4) 'Cereals - Excluding Beer', 5) 'Animal Products', 6) 'Pulses', 7) 'Vegetal Products', 8) 'Fish, Seafood', 9) 'Animal fats', and 10) 'Stimulants'. Seven of these are dietary items; 'Life expectancy', 'Income composition of resources', and 'Active' are not.
Thresholds: Organizing Numerical Values in Diet into Categorical for Comparison
We will set a threshold of 0.1 on 'Recovered' to see which countries fall at or below it. Because the dataset reports this column in percent, the cutoff corresponds to 0.1 percent of the population.
merged_df[merged_df['Recovered'] <= 0.1]
| | Country | Animal Products | Animal fats | Cereals - Excluding Beer | Eggs | Fish, Seafood | Fruits - Excluding Wine | Meat | Milk - Excluding Butter | Offals | ... | percentage expenditure | BMI | under-five deaths | Polio | Diphtheria | HIV/AIDS | Population_y | Income composition of resources | Schooling | Health_index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 21.6397 | 6.2224 | 8.0353 | 0.6859 | 0.0327 | 0.4246 | 6.1244 | 8.2803 | 0.3103 | ... | 71.279624 | 19.1 | 83 | 6.0 | 65.0 | 0.1 | 33736494.0 | 0.479 | 10.1 | 32.3 |
| 1 | Albania | 32.0002 | 3.4172 | 2.6734 | 1.6448 | 0.1445 | 0.6418 | 8.7428 | 17.7576 | 0.2933 | ... | 364.975229 | 58.0 | 0 | 99.0 | 99.0 | 0.1 | 28873.0 | 0.762 | 14.2 | 52.9 |
| 2 | Algeria | 14.4175 | 0.8972 | 4.2035 | 1.2171 | 0.2008 | 0.5772 | 3.8961 | 8.0934 | 0.1067 | ... | 0.000000 | 59.5 | 24 | 95.0 | 95.0 | 0.1 | 39871528.0 | 0.743 | 14.4 | 23.6 |
| 3 | Angola | 15.3041 | 1.3130 | 6.5545 | 0.1539 | 1.4155 | 0.3488 | 11.0268 | 1.2309 | 0.1539 | ... | 0.000000 | 23.3 | 98 | 7.0 | 64.0 | 1.9 | 2785935.0 | 0.531 | 11.4 | 25.2 |
| 4 | Antigua and Barbuda | 27.7033 | 4.6686 | 3.2153 | 0.3872 | 1.5263 | 1.2177 | 14.3202 | 6.6607 | 0.1347 | ... | 0.000000 | 47.7 | 0 | 86.0 | 99.0 | 0.2 | NaN | 0.784 | 13.9 | 29.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 139 | Uruguay | 25.5069 | 3.4811 | 2.5698 | 1.2804 | 0.3281 | 0.1777 | 12.2841 | 8.0603 | 0.0729 | ... | 0.000000 | 64.0 | 0 | 95.0 | 95.0 | 0.1 | 3431552.0 | 0.794 | 15.5 | 41.3 |
| 140 | Uzbekistan | 25.9903 | 2.4884 | 2.7168 | 1.0639 | 0.0962 | 0.5830 | 10.3624 | 11.8050 | 0.1743 | ... | 0.000000 | 44.7 | 17 | 99.0 | 99.0 | 0.1 | 312989.0 | 0.697 | 12.1 | 34.3 |
| 142 | Yemen | 12.5401 | 2.0131 | 11.5271 | 0.5514 | 0.3847 | 0.2564 | 8.0010 | 1.3463 | 0.2436 | ... | 0.000000 | 41.3 | 47 | 63.0 | 69.0 | 0.1 | NaN | 0.499 | 9.0 | 18.5 |
| 143 | Zambia | 9.6005 | 1.6113 | 14.3225 | 0.6266 | 1.0070 | 0.1343 | 4.9010 | 1.2756 | 0.1790 | ... | 0.000000 | 23.4 | 40 | 9.0 | 9.0 | 4.1 | 161587.0 | 0.576 | 12.5 | 28.7 |
| 144 | Zimbabwe | 10.3796 | 2.9543 | 9.7922 | 0.3682 | 0.2455 | 0.0614 | 4.5674 | 2.1040 | 0.1315 | ... | 0.000000 | 31.8 | 32 | 88.0 | 87.0 | 6.2 | 15777451.0 | 0.507 | 10.3 | 38.2 |
119 rows × 40 columns
We are curious about a region with a high 'Treenuts' content in its daily diet, so we selected the Eastern Mediterranean region from the World Health Organization grouping.
We will set a threshold of .015 on 'Treenuts' (the dataset reports this column in percent) to classify countries by how much they incorporate tree nuts into their diet.
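The `treenuts_categorical` column used in the chart that follows is not defined in this part of the notebook. One way such a column could be derived from the .015 threshold is sketched here in plain Python. The label strings, the helper name, and the 'Example country' entry are hypothetical; only the Yemen and United Arab Emirates values come from the discussion in this notebook.

```python
TREENUT_THRESHOLD = 0.015  # cutoff stated in the text

def treenut_category(value, threshold=TREENUT_THRESHOLD):
    """Label a 'Treenuts' value as above or at/below the threshold."""
    return 'high treenuts' if value > threshold else 'low treenuts'

# Yemen and UAE values are quoted in the notebook; the third entry is invented
treenuts = {
    'Yemen': 0.076,
    'United Arab Emirates': 3.82,
    'Example country': 0.010,  # hypothetical value for illustration
}

categories = {country: treenut_category(v) for country, v in treenuts.items()}
print(categories)
```

On the DataFrame itself, the same rule could be applied with `merged_df['Treenuts'].apply(treenut_category)`.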
#Inserting top ranked countries w/highest number of COVID-19 cases (Italy, US, and Brazil(more recently))
chart = sns.catplot(x="Country", y="Recovered", hue='treenuts_categorical', kind="bar",
data=merged_df[merged_df['Country'].isin(['Italy',
'United States', 'Brazil','Iran','Lebanon', 'Afghanistan','Kuwait',
'Pakistan', 'Saudi Arabia','Jordan','Syria','Yemen','Egypt',
'United Arab Emirates','Oman','Bahrain','Qatar','Morocco','Libya',
'Tunisia','Iraq'])]);
chart.set_xticklabels(rotation=45, horizontalalignment='right')
plt.title('Recovery Rate Across Eastern Mediterranean Countries Regarding Treenuts Diet')
Text(0.5, 1, 'Recovery Rate Across Eastern Mediterranean Countries Regarding Treenuts Diet')
After running three models and tuning our parameters, we improved both our training and test set scores. For example, our tuned Random Forest Regressor's train score of 0.512 improved on the 0.386 of our LR model. Meanwhile, its test score of 0.320 improved upon our LR model's lower test score of 0.206 after parameter tuning.
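To make the comparison concrete, the train and test scores printed earlier in the notebook can be placed side by side. The sketch uses only those printed scores, rounded to three decimals; the dictionary layout and variable names are ours.

```python
# (train R^2, test R^2) as printed in the notebook, rounded to 3 decimals
scores = {
    'Decision Tree': (1.000, -0.611),
    'Linear Regression': (0.386, 0.206),
    'Random Forest (max_depth=2)': (0.764, 0.251),
    'Random Forest (tuned)': (0.512, 0.320),
}

# Train-test gap: a rough indicator of how much each model overfits
gaps = {name: round(train - test, 3) for name, (train, test) in scores.items()}

best_test = max(scores, key=lambda name: scores[name][1])
smallest_gap = min(gaps, key=gaps.get)

for name, (train, test) in scores.items():
    print(f'{name}: train={train:.3f} test={test:.3f} gap={gaps[name]:.3f}')
print('Best test score:', best_test)
print('Smallest gap:', smallest_gap)
```

The tuned forest has the best test score, while the linear model has the smallest train-test gap, so the two criteria do not point to the same model.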
Our Random Forest Model dealt with the entropy, or noise, that is introduced when many features are included in a model to measure our target, the COVID-19 recovery rate ('Recovered'). Many factors play a role. Surprisingly, the Health Score Index did not carry the majority of the explanatory power. As seen in our first visual, the scatterplot did not show a one-to-one correlation. Countries with a medium Health Score Index (HSI) had higher COVID-19 recovery rates than some countries with a higher HSI. At this point, we considered additional factors beyond state investment, like nutrition and regional dietary choices. We selected a region, the Eastern Mediterranean, to review the nutritional impact on the 'Recovered' rate by considering how nutrition may build immunity to stave off viruses. The Eastern Mediterranean region emerged as a choice because it experienced the Middle East Respiratory Syndrome in 2015.
Among the Eastern Mediterranean countries included (Afghanistan, Egypt, Iraq, Jordan, Kuwait, Lebanon, Morocco, Oman, Pakistan, Saudi Arabia, Tunisia, United Arab Emirates, and Yemen), the 'Treenuts' share of the diet ranges from .076 (Yemen) to 3.82 (UAE). This range may be attributed to income per capita: Yemen, at the lowest end of the spectrum, is a lower-income country, while the UAE, at the highest end, is a high-income country. We would need to do a comparative analysis to see if the same trend occurs with this diet in another region and if income disproportionately affects the consumption of tree nuts.
Upon reflection, we could have controlled for income and the access to nutrition it provides. More specifically, nutrition includes higher-priced food items that are more expensive to transport from mountainous to desert regions. If we were to run a regression model on nutritional items and price, 'Tree nuts' (almonds, pine nuts, hazelnuts, pistachios, and walnuts) would likely command a premium.
To better test the theory that regional nutrition positively influences the COVID-19 recovery rate, we should have sampled the top 10 percent of the population in each Eastern Mediterranean country. Then we would have been better able to compare access to high-priced food items, like the 'Tree nuts' category, which are more readily available and prepared in higher-priced dishes, like desserts and meat-based rice. These are dishes that populations below the poverty line cannot afford, and so cannot easily incorporate into their daily diets to build immunity and improve their likelihood of recovery if infected by COVID-19.
Additionally, we can supplement our nutritional information with the total protein content across nutritional categories and combine it with features covering the percentage of energy (in kilocalories) consumed from each type of food listed. Then, we may conduct a Principal Component Analysis on the nutritional feature subcategories to give a more precise slice of nutrition per region for the upper 10 percent of the population.